OM protein - protein search, using sw model

Run on:          April 25, 2006, 09:07:29 ; Search time 42 Seconds
                                            (without alignments)
                                            1216.455 Million cell updates/sec

Title:           US-10-721-553-2
Perfect score:   2764
Sequence:        1 MAPTIQTQAQREDGHRPNSH ... QEDGSEAAASDSSEADSDSD 531

Scoring table:   BLOSUM62
                 Gapop 10.0 , Gapext 0.5

Searched:        283416 seqs, 96216763 residues

Total number of hits satisfying chosen parameters:    283416

Minimum DB seq length: 0
Maximum DB seq length: 2000000000

Post-processing: Minimum Match 0%
                 Maximum Match 100%
                 Listing first 45 summaries

Database:        PIR_80:*
                 pir1:*
                 pir2:*
                 pir3:*
                 pir4:*
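
The throughput figure in the header can be cross-checked from the other header values. The sketch below assumes that one "cell update" is one (query residue, database residue) pair in the Smith-Waterman matrix, and that the query is 531 residues long, as the Sequence line indicates. The function name is hypothetical and is not part of the search tool.

```python
# Hypothetical helper for checking the header arithmetic; not part of the tool.
def cell_updates_per_second(query_len: int, db_residues: int, seconds: float) -> float:
    """Dynamic-programming cells filled per second for one query against a database."""
    return query_len * db_residues / seconds

# Values taken from the header: 531-residue query, 96216763 residues, 42 seconds.
rate = cell_updates_per_second(531, 96_216_763, 42)
print(f"{rate / 1e6:.3f} Million cell updates/sec")
```

Under these assumptions the result agrees with the reported 1216.455 Million cell updates/sec.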



Pred. No. is the number of results predicted by chance to have a 
score greater than or equal to the score of the result being printed, 
and is derived by analysis of the total score distribution. 
RESULT 1
T20261
hypothetical protein C55A6.9 - Caenorhabditis elegans
C;Species: Caenorhabditis elegans
C;Date: 15-Oct-1999 #sequence_revision 15-Oct-1999 #text_change 09-Jul-2004
C;Accession: T20261
R;Kershaw, J.
submitted to the EMBL Data Library, October 1996
A;Reference number: Z19243
A;Accession: T20261
A;Status: preliminary; translated from GB/EMBL/DDBJ
A;Molecule type: DNA
A;Residues: 1-425 <WIL>
A;Cross-references: UNIPROT:P90783; UNIPARC:UPI00000748D2; EMBL:Z81051;
PIDN:CAB02869.1; GSPDB:GN00023; CESP:C55A6.9
A;Experimental source: clone C55A6
C;Genetics:
A;Gene: CESP:C55A6.9
A;Map position: 5
A;Introns: 14/2; 48/2; 90/3; 177/3; 381/1

Query Match 23.3%; Score 645; DB 2; Length 425;
Best Local Similarity 33.1%; Pred. No. 1.5e-26;
Matches 146; Conservative 96; Mismatches 165; Indels 34; Gaps 10;
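
The percentages in each hit summary appear to follow from the other numbers on the same lines. The sketch below is an inference from the figures in this report, not a statement of how the search tool is documented to work: it assumes Query Match is the hit score as a percentage of the perfect score, and Best Local Similarity is the number of matches as a percentage of all alignment columns. The function names are hypothetical.

```python
PERFECT_SCORE = 2764  # "Perfect score" from the report header

def query_match(score: float, perfect: float = PERFECT_SCORE) -> float:
    """Hit score as a percentage of the query's self-alignment score."""
    return 100.0 * score / perfect

def best_local_similarity(matches: int, conservative: int,
                          mismatches: int, indels: int) -> float:
    """Identical positions as a percentage of all alignment columns."""
    columns = matches + conservative + mismatches + indels
    return 100.0 * matches / columns

# Values from RESULT 1 (T20261).
print(round(query_match(645), 1))
print(round(best_local_similarity(146, 96, 165, 34), 1))
```

For RESULT 1 these give 23.3 and 33.1, the values printed in the summary.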
RESULT 2
C96828
unknown protein F19K16.29 [imported] - Arabidopsis thaliana
C;Species: Arabidopsis thaliana (mouse-ear cress)
C;Date: 02-Mar-2001 #sequence_revision 02-Mar-2001 #text_change 09-Jul-2004
C;Accession: C96828
R;Theologis, A.; Ecker, J.R.; Palm, C.J.; Federspiel, N.A.; Kaul, S.; White, O.;
Alonso, J.; Altaf, H.; Araujo, R.; Bowman, C.L.; Brooks, S.Y.; Buehler, E.;
Chan, A.; Chao, Q.; Chen, H.; Cheuk, R.F.; Chin, C.W.; Chung, M.K.; Conn, L.;
Conway, A.B.; Conway, A.R.; Creasy, T.H.; Dewar, K.; Dunn, P.; Etgu, P.;
Feldblyum, T.V.; Feng, J.; Fong, B.; Fujii, C.Y.; Gill, J.E.; Goldsmith, A.D.;
Haas, B.; Hansen, N.F.; Hughes, B.; Huizar, L.
Nature 408, 816-820, 2000
A;Authors: Hunter, J.L.; Jenkins, J.; Johnson-Hopson, C.; Khan, S.; Khaykin, E.;
Kim, C.J.; Koo, H.L.; Kremenetskaia, I.; Kurtz, D.B.; Kwan, A.; Lam, B.;
Langin-Hooper, S.; Lee, A.; Lee, J.M.; Lenz, C.A.; Li, J.H.; Li, Y.; Lin, X.; Liu,
S.X.; Liu, Z.A.; Luros, J.S.; Maiti, R.; Marziali, A.; Militscher, J.; Miranda,
M.; Nguyen, M.; Nierman, W.C.; Osborne, B.I.; Pai, G.; Peterson, J.; Pham, P.K.;
Rizzo, M.; Rooney, T.; Rowley, D.; Sakano, H.
A;Authors: Salzberg, S.L.; Schwartz, J.R.; Shinn, P.; Southwick, A.M.; Sun, H.;
Tallon, L.J.; Tambunga, G.; Toriumi, M.J.; Town, C.D.; Utterback, T.; van Aken,
S.; Vaysberg, M.; Vysotskaia, V.S.; Walker, M.; Wu, D.; Yu, G.; Fraser, C.M.;
Venter, J.C.; Davis, R.W.
A;Title: Sequence and analysis of chromosome 1 of the plant Arabidopsis.
A;Reference number: A86141; MUID:21016719; PMID:11130712
A;Accession: C96828
A;Status: preliminary
A;Molecule type: DNA
A;Residues: 1-547 <STO>
A;Cross-references: UNIPROT:Q9CA82; UNIPARC:UPI00000A4648; GB:AE005173;
NID:g6453869; PIDN:AAF09053.1; GSPDB:GN00141
C;Genetics:
A;Gene: F19K16.29
A;Map position: 1

Query Match 12.1%; Score 335.5; DB 2; Length 547;
Best Local Similarity 24.8%; Pred. No. 2.1e-10;
Matches 121; Conservative 69; Mismatches 136; Indels 161; Gaps 20;

Qy 6 QTQAQRED-GHR PNSHRT LPERSGVVCRVKYCN 37
: : I I : I I I III:: I : : : I : : I : I
Db 143 ELEKQRQDEKHRQQMKNSHKSQMPKGHTEEKKPTPLLTTDRVENRLKKPTTFICKLKFRN 202

Qy 38 SLPDIPFDPKFIT YPFDQNRFVQYKATSLEKQHKHDLLTEPDLGVTIDLINPDT 91
Ml |:| : I I 11:1 I I I I I I : I I I I I : : I I : :
Db 203 ELPDPSAQLKLMTIKRDKDHYFDPTRFTKYTITSLEKLWKPKIFVEPDLGIPLDLLDLSV 262

Qy 92 YRIDPNVL--LDPADEKLLEEEIQAPTSSKRSQQHAKVVP WMRKTEYISTEFNR 143
I II I I I I : II : : I I I : II I : I I : I I I :
Db 263 YN-PPKVKAPLAPEDEELLRDD-DAVTPIKKDGIRRKERPTDKGMSWLVKTQYISS 316

Qy 144 YGISNEKPEVKIGVSVKQQFTEEEI YKDRDSQITAIEKTFEDAQK 188
|:|| I : I II:: :|: II II :ll I I
Db 317 --INNE SARQSLTEKQAKELREMKGGINILHNLNNRERQIKDIEASFE-ACK 365

Qy 189 SISQHYSKPRVTPVEVMPVFPDFKMWINPCAQVIFDS DPAPKDTSGAA 236
I I : : I I I I : I : I I : II I I : : I :
Db 366 SRPVHATNKNLQPVEVLPLLPYFDRYDEQFVVANFDGAPIADSEFFGKLDPSIRDAHESR 425

Qy 237 ALEMMSQAMIRGMMDEEGNQFVAYFLPVEETLKKRKRDQEEEMDYAPDDVYDYKIAREYN 296
I S | : : : : | : I : I I : I . : I I I : I I : I Ml
Db 426 AI--LKSYVVAGSDTANPEKFLAYMVPSLDELSKDIHDENEEISYT WVREYL 475

Qy 297 WNVKNKASKGYEENYFFIFREGDGVYYNELETRVRLSKRRAKAGVQSGTNALLVVKHRDM 356
I : I : I : II : I I : I I I
Db 476 WDVQPNAN DPGTYL VSFDNGTASYLVYSSR-- 505

Qy 357 NEKELEAQEARKAQLEN HEPEEEEEEEMETEEKEAGGSDEEQEKGSSSEKE 407
: I : : : I I : MM:: : : :
Db 506 VGASSSKMRRLEDEGGLGRSWKHEPEQD ANQYSD 539

Qy 408 GSEDEHS 414
I : I I : : I
Db 540 GNEDDYS 546
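
The three summary lines that precede each alignment follow a regular "name value;" layout, which makes them straightforward to read programmatically. The sketch below is one possible way to do that with a regular expression. The names `FIELD` and `parse_summary` are hypothetical, and the sample text is the RESULT 2 summary as printed in this report.

```python
import re

# Summary lines for RESULT 2 (C96828), copied from the report.
SUMMARY = """\
Query Match 12.1%; Score 335.5; DB 2; Length 547;
Best Local Similarity 24.8%; Pred. No. 2.1e-10;
Matches 121; Conservative 69; Mismatches 136; Indels 161; Gaps 20;
"""

# A field is a name (letters, dots, spaces), whitespace, a number, an optional
# percent sign, and a terminating semicolon.
FIELD = re.compile(
    r"([A-Za-z][A-Za-z. ]*?)\s+([-+]?\d+(?:\.\d+)?(?:[eE][-+]?\d+)?)%?;"
)

def parse_summary(text: str) -> dict[str, float]:
    """Return a mapping from field name to numeric value."""
    return {name.strip(): float(value) for name, value in FIELD.findall(text)}

fields = parse_summary(SUMMARY)
print(fields["Score"], fields["Pred. No."], fields["Gaps"])
```

Percent signs are dropped, so `fields["Query Match"]` holds the percentage as a plain number.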



RESULT 3
T50233
probable DNA-directed RNA polymerase II regulator [imported] - fission yeast
(Schizosaccharomyces pombe)
C;Species: Schizosaccharomyces pombe
C;Date: 09-Jun-2000 #sequence_revision 09-Jun-2000 #text_change 09-Jul-2004
C;Accession: T50233
R;Cadieu, E.; Lelaure, V.; Galibert, F.; McDougall, R.C.; Rajandream, M.A.;
Barrell, B.G.
submitted to the EMBL Data Library, January 1999
A;Reference number: Z25048
A;Accession: T50233
A;Status: preliminary; translated from GB/EMBL/DDBJ
A;Molecule type: DNA
A;Residues: 1-457 <CAD>
A;Cross-references: UNIPROT:Q9US06; UNIPARC:UPI000006A70C; EMBL:AL136235;
PIDN:CAB65804.1; GSPDB:GN00066; SPDB:SPAC664.03
A;Experimental source: strain 972h(-); cosmid c664
C;Genetics:
A;Gene: SPDB:SPAC664.03
A;Map position: 1
A;Introns: 1/3

Query Match 10.1%; Score 280.5; DB 2; Length 457;
Best Local Similarity 24.9%; Pred. No. 1.2e-07;
Matches 122; Conservative 94; Mismatches 192; Indels 81; Gaps 21;

Qy 26 RSGVVCRVKYCNSLPDIPFDPKFITYPFDQNRFVQYK ATSLEKQHKHDLLTEPDLG 81
« I : I I : I I I I I I I I I I I II : : I : : I : : : I I
Db 6 RQDYILRVRYHNPLPPPPFPPKLINIP NPVKQYALPNFVSTLVQEKKIPIENDIELG 62

Qy 82 VTIDL INPDTYRID PNVLLDPADEKLLEEEIQAPTSSKRSQQHAKVVPWMR 132
; : : I I ||: : I I I I I I I : : I : I I : : I
Db 63 MPLDLAGITGFFEGDTSWMHSDLSSVNLDPIDRSLLK VAGGSGSTHLE-VPFLR 115

Qy 133 KTEYISTEFNRYGISNEKPEVKIGVSVKQQFTEE--EIYKDRDSQITAIEKTFEDAQKSI 190
: I I I I I : I I : : : : : : I : I : : : I : I I : : I : I : :
Db 116 RTEYISSEVAR--AASNRGNLRLTASTSKALAEQRGRSLREVPKQLEAINESFDVVQQPL 173

Qy 191 SQ--HYSKPRVTPVEVMPVFPDFKMWINPCAQVI FDSDPAPKDTSGAA 236
1 11:11:11 : I : I : I : : I : :
Db 174 EQLKHPTKPDLKPVSAWNLLPNTSMAGIQHLMLRVADDLSERSHSYSSLVNLQEGHNLTK 233

Qy 237 ALEMMSQAMIRGMMDEEGNQFVAYFLPVEETLKKRKRDQEEEMDYAPDDVYDYKIA 292
i I I I I I | ! : | : : | : M I I I : : : : : : I II:: :
Db 234 RHEVALFMPSSA EGEEFLSYYLPSEETAE E I QAKVN DAS ADVHE P FV Y 281

Qy 293 REY-NWNVKNKASKGYEENYFFIFR EGDGVYYNELETRVRLSKRRAKAGVQSGT 345
: I : : : I : I I : I I : I I : I : I I
Db 282 NHFRNFDASMHVNSTGLEDLCLTFHTDKDHPEANQVLYTPIYARSTLKRRHVRAPVSLDA 341

Qy 346 NALLVVKHRDMNEKE-LEAQEAR--KAQLENHEPEEEEEEEMETEEKEAGGSDEEQEKGS 402
: : I I : I : : I I : : I I I I : I I I I I : : : I I : I I
Db 342 VDGIELSLRDLNDEESLQLKRARYDTFGLGNIKDLEEEEEKLRSVE GSLNEE L 394

Qy 403 SSEKEGSEDEHSGSESEREEGDRDEASDKSGSGEDESSEDEARA-ARDKEEIFGSDADSE 461
I I : : : I : I : I : I : : I : I I : I : I III: I
Db 395 SEEEKPAESREQLESAEQTNGVKPETQAQNMS ASESQANSPAPPVEE--GNTQPSP 448

Qy 462 DDADSDDED 470
: : : I I
Db 449 VEQLQNEED 457

RESULT 4
S44541
hypothetical protein YBR279w - yeast (Saccharomyces cerevisiae)
N;Alternate names: hypothetical protein YBR2016; Paf1 protein
C;Species: Saccharomyces cerevisiae
C;Date: 08-Jun-1994 #sequence_revision 09-Sep-1994 #text_change 09-Jul-2004
C;Accession: S44541; S46161; JC6088; PC6031; S39135
R;Holmstrom, K.; Brandt, T.; Kallesoe, T.
Yeast 10(Suppl.A), S47-S62, 1994
A;Title: The sequence of a 32420 bp segment located on the right arm of
chromosome II from Saccharomyces cerevisiae.
A;Reference number: S44537; MUID:94378722; PMID:8091861
A;Accession: S44541
A;Status: translation not shown
A;Molecule type: DNA
A;Residues: 1-445 <HOL>
A;Cross-references: UNIPROT:P38351; UNIPARC:UPI0000053037; EMBL:X76053;
NID:g600025; PIDN:CAA53642.1; PID:g429124
R;Brandt, T.; Christiansen, C.; Holmstroem, K.; Kallesoe, T.
submitted to the Protein Sequence Database, August 1994
A;Reference number: S46157
A;Accession: S46161
A;Molecule type: DNA
A;Residues: 1-445 <BRA>
A;Cross-references: UNIPARC:UPI0000053037; EMBL:Z36148; NID:g536721;
PIDN:CAA85243.1; PID:g536722; MIPS:YBR279w
R;Shi, X.; Finkelstein, A.; Wolf, A.J.; Wade, P.A.; Burton, Z.F.; Jaehning, J.A.
Mol. Cell. Biol. 16, 669-676, 1996
A;Title: Paf1p, an RNA polymerase II-associated factor in Saccharomyces
cerevisiae, may have both positive and negative roles in transcription.
A;Reference number: JC6088; MUID:96140434; PMID:8552095
A;Accession: JC6088
A;Molecule type: DNA
A;Residues: 1-166,168-445 <SHI>
A;Cross-references: UNIPARC:UPI0000179A39
A;Experimental source: strain YJJ453
A;Accession: PC6031
A;Molecule type: DNA
A;Residues: 5-11;420-427 <SH2>
A;Cross-references: UNIPARC:UPI0000179A3A; UNIPARC:UPI0000179A3B
C;Comment: This factor is a highly charged nuclear protein, and acts as a
cofactor important for transcriptional activation and repression from diverse
promoters.
C;Genetics:
A;Gene: SGD:PAF1
A;Cross-references: SGD:S0000483; MIPS:YBR279w
A;Map position: 2R
A;Note: this gene is located at the right arm of chromosome II
C;Superfamily: Saccharomyces cerevisiae hypothetical protein YBR279w
C;Keywords: nucleus
F;25-49/Region: PEST sequence
F;119-141/Region: nuclear location signal

Query Match 9.2%; Score 253; DB 2; Length 445;
Best Local Similarity 22.1%; Pred. No. 3.2e-06;
Matches 112; Conservative 96; Mismatches 184; Indels 114; Gaps 21;

Qy 23 LPERSGVVCRVKYCNSLPDI PFDPKFITYP FDQNRFVQYKATSLEKQHK 71
: : : : : I I I I I I I I : I I : : : : I : I : I
Db 1 MSKKQEYIAPIKYQNSLPVPQLPPKLLVYPESPETNADSSQLINSLYIKTNVTNLIQQ-- 58

Qy 72 HDLLTEPDLGVTI DLI NPDTYRIDPNVLLDPADEKLLEEEIQAPTSSKRS 121
: I I I : : I I : : I I II II I II : I : :
Db 59 DEDLGMPVDLMKFPGLLNKLDSKLLYGFD-NVKLDKDDRILLRD PRIDRLT 108

Qy 122 QQHAKVVPWMRKTEYISTEFNRYGISNEKPEVKIGVSVKQQFTEEEIYKDRDSQ 175
: I : : I : I I I : I : : : I : : : III
Db 109 KTDISKVTFLRRTEYVSNTIAAHDNTSLKRKRRL DDGDSDDENLDV 154

Qy 176 ITAIEKTFEDAQKSISQHYSKPRVTPVEVMPVFPDFKMWINPCAQVIFDSDPAPKDT 232
I : : I I I I Mill: : I I II I
Db 155 NHIISRVEGTFNKTDK--WQHPVKKGVKMVKKWDLLPD TASMDQVYF ILKF 203

Qy 233 SGAAALEMMSQAMIR-GM MDEEGNQFVAYFLPVEETLKKRKRDQEEEMDYAPDDVYD 288
|:|:|: : : |: :: I ::::: : : : : |: II II ::
Db 204 MGSASLDTKEKKSLNTGIFRPVELEEDEWISMYATDHKDSAILENELEKGMDEMDDDSHE 263

Qy 289 YKIAREYNWNVKNKASKGYEENYFFIFREGDGV-YYNELETRVRLSKRRAKAGVQSG 344
M I : : : : I II I : I I : I I I : : : I : I I : :
Db 264 GKI YKFKRIRDYDMKQVAEKPMTE-LAIRLNDKDGIAYYKPLRSKIELRRRRVNDI IKP- 321

Qy 345 TNALLVVKH RDMNEKELEAQEARKAQLEN HEPEEEEEEEMETEEK 389
||:| I: : II :: : : : | ::|:||: I :|
Db 322 LVKEHDIDQLNVTLRNPSTKEANIRDKLRMKFDPINFATVDEEDDEDEEQPEDVKK 377

Qy 390 EAGGSDEEQEKGSSSEKEGSEDEHSGSESEREEGDRDEASDKSGSGEDESSEDEARAARD 449
1:1 : : : I I II : I I III: : : I : : I I I I I
Db 378 ESEG--DSKTEGSEQEGENEKDEEIKQEKENEQ DEENKQDENRAADT 422

Qy 450 KEEIFGSDADSEDDADSDDEDRGQAQ 475
I III : : : : : I :
Db 423 PET SDAVHTEQKPEEEKETLQEE 445



RESULT 5
PN0009
neurofilament triplet M protein - Pacific electric ray (fragment)
C;Species: Torpedo californica (Pacific electric ray)
C;Date: 17-Jul-1992 #sequence_revision 17-Jul-1992 #text_change 09-Jul-2004
C;Accession: PN0009
R;Linial, M.; Scheller, R.H.
J. Neurochem. 54, 762-770, 1990
A;Title: A unique neurofilament from Torpedo electric lobe: sequence,
expression, and localization analysis.
A;Reference number: PN0009; MUID:90155300; PMID:2106008
A;Accession: PN0009
A;Molecule type: mRNA
A;Residues: 1-784 <LIN>
A;Cross-references: UNIPROT:Q7LZ90; UNIPARC:UPI00001774FE
C;Comment: Neurofilaments are a subgroup of intermediate filaments which are
expressed specifically in neuronal cells.
C;Superfamily: cytoskeletal keratin
C;Keywords: coiled coil; cytoskeleton; intermediate filament; nerve;
phosphoprotein; tandem repeat
F;1-52/Region: serine-rich
F;53-84/Region: coil 1a
F;98-194/Region: coil 1b
F;217-367/Region: coil II
F;400-597/Region: glutamic acid-rich
F;598-674/Region: 6-residue repeats
F;675-784/Domain: carboxyl-terminal #status predicted <CTD>
F;616,622,628,634,640,646,652,658,670/Binding site: phosphate (Ser) (covalent)
#status predicted

Query Match 8.9%; Score 245.5; DB 2; Length 784;
Best Local Similarity 20.5%; Pred. No. 1.5e-05;
Matches 122; Conservative 102; Mismatches 229; Indels 143; Gaps 23;

Qy 47 KFITYPFDQNRFVQYKATSLEKQ HKHDLLTEPDLG VTIDLINPDTYR 93
. : I I I : : : : : I I : I : : : II : | : : I : :
Db 62 RFAGY-IDKVHYLEQQNKELEAEIQAHRQKQVSHGQLGDVYDQEIRELRSIEQVNQEKAQ 120

Qy 94 ID-PNVLLD PADEKLLEEE IQAPTSSKRSQQHAKVV 128
i 1:111 I I : I : I III I : : I I
Db 121 IQLDSVHLDDDFQRVGAFDEEALRDEDPEATIRVLKKETEDSVIQAGDGEKKAQSLQDEV 180

Qy 129 PWMRKTEYISTEFNRYGISNEKPEVKIGVSVKQQFTEEEIYKDRDSQITAIEKTFEDAQK 188
: : I : I : I I : : I I : : I : I II : : :
Db 181 AFLR NNHEEEV-ADLFAQIQATQVTVEK-KDFLKTDITSALKEIRS 224

Qy 189 SISQHYSKPRVTPVEVMPVFPDFKMW INPCAQVIFDSDPAPKDTSGAAALEMM 241
: | : I | I : I I : : I : I : : I : :
Db 225 QLEGHSAKNMQQADE WFKCRYDKLNEAAEMNKDAIRAAREEIGEYRRQLQ 274

Qy 242 SQAM IRGM MDEEGNQFVAYFLPVEETLKKRKRDQEEEM DYA 282
i : : : : | : : : I I I : : I : I : I I : I
Db 275 SKSIELESVRSTKESLERQLTDIEDRHNADVANYQETVQQLENELRGTKWEMARHLREY- 333

Qy 283 PDDVYDYKIAREYNWNVKNKASKGYEENYFFIFREGDGVYYN ELETRVRLSKRRA 337
I : : I : I : I : I I I I I : I I : I : I
Db 334 -QDLLNVKMALDIEIAAYRKLLEGEESRYTF-SGTGPSIPYRSPSRPRLPAKVHKTKEVP 391

Qy 338 KAGVQSGTNALLVVKHRDMNEK ELEAQEARKAQLENHEPEEEEEEEMETEE 388
IN : : : : : I I : : : I I : I I : : I I I II
Db 392 KVKVQHKFVEEIIEETKVKDEKAEMGDIDLAEAVEGGATMESPEDKEEAEKVVEEAIVAT 451

Qy 389 KEAGGSDEEQEKGSSSEKEGSEDEHSGSESEREEGDRDEASDKSGSGEDESSED 442
^ I | | | : : I I I I : I | | : : : : | I : I I I I :
Db 452 VKAGVQAEPRGEAEEESEAKEEEDEGVEEEEEKKE-EADDEEKGEEKDEEGEAEDEAEGG 510

Qy 443 EARAARDKEEIF GSDADSEDDADSDDEDRGQAQGGSDNDSDSGSNGGGQR- 492
1:1 : | | | II I : : : : I : I : I I I I : I I : I I
Db 511 ESRVVEEKVEIVKVEQSKAHPGKDEVKEERKEEEEEEEGEASGESDKESTGGAINGSQEE 570

Qy 493 SRSHSRSASPFPSGSEHSAQED GSEAAASDSSEADSDS 530
: : II : I : I I I : I : : : : :
Db 571 SKGKVEEKLTVEKTEKATEDKVSPREEKPQKEEQKDIEEKKEEAKSKDEAKSKDEA 626

RESULT 6
T42963
hypothetical protein 48 - ateline herpesvirus 3 (strain 73)
C;Species: ateline herpesvirus 3
A;Variety: strain 73
C;Date: 21-Jan-2000 #sequence_revision 21-Jan-2000 #text_change 09-Jul-2004
C;Accession: T42963
R;Albrecht, J.C.; Fleckenstein, B.
submitted to the EMBL Data Library, August 1998
A;Description: Primary structure of the herpesvirus ateles genome.
A;Reference number: Z22274
A;Accession: T42963
A;Status: preliminary; translated from GB/EMBL/DDBJ
A;Molecule type: DNA
A;Residues: 1-792 <ALB>
A;Cross-references: UNIPROT:Q9YTL7; UNIPARC:UPI00000EC1E3; EMBL:AF083424;
PIDN:AAC95573.1
A;Experimental source: strain 73

Query Match 8.7%; Score 241; DB 2; Length 792;
Best Local Similarity 21.2%; Pred. No. 2.6e-05;
Matches 94; Conservative 69; Mismatches 173; Indels 108; Gaps 14;



Qy 169 YKDRDSQITAIEKTFEDAQKSI SQHYSKPRVTPVEVMPVFP 209
1 : 1 : 1 1 1 : : : 1 1 I : : : : 1 :
Db 213 YQYMSSDLIAIEEALQSSYLSICGSTYPSYSKILELLTANMSKEHIRQKVNVTD 266

Qy 210 DFKMWINPCA-QVIFDSDPAPKDTSGAAALEMMSQAMIRGM MDEEGNQ 256
: 1 1 1 : 1 1 : 1 : : : : : 1 1 : :
Db 267 FIKPSLHQMIRDTKKEPRQKTKTLMISILGS RGIGLDLFRTQDVLKFPSSDAK 319

Qy 257 FVAYFLP--- VEETLKKRKRDQEEEMDYAPDDVYDYKIAREYNWNVKNKASK 305
I : 1 I II:: III: III : I : I I : I : 1
Db 320 FMAVSQPDNFNEKEVEFSMTGGKTDSEDVT--APRKVGKNSLNRKYLENLKDNKRKNNNY 377

Qy 306 -GYEENYFFI FREGDG VYYNELE TRVRLSKRRAKAGVQSGTNAL-- 348
1 I : 1 1 1 : 1 1 1 1 : 1 1 : 1 : 1 :
Db 378 SGRNNKY KGDGANDKDKSIDKNESEGGDHSEINREKNRKRKKPNGFRVGDKEVGE 432

Qy 349 LVVKHRDMNEKELEAQEARKAQLENHEPEEEEEEEMETEEK 389
I : I : I : : : 1 : : 1 1 1 1 1 1 1 1 1 1 1 1 :
Db 433 EKSVKSGEGKKSEKDSEEEAEDKDEEENKKKGDGEEDEEDEEEEDEEEEEEEEEEEDEEE 492

Qy 390 EAGGSDEEQEKGSSSE--KEGSEDEHSGSESEREEGDRDEASDKSGSGEDESSEDEARAA 447
1 Mil: 1 : 1 III : 1 1 1 : II : : 1 : : : : 1
Db 493 EEEEEDEEDEEEEEDEEDEEDEEDEEDEEDEEDEEDEEDEEEEEDEEEEEDEEDEEEEEE 552

Qy 448 RDKEEIFGSDADSEDDADSDDEDRGQAQGGSDNDSDSGSNGGGQRSRSHSRSASPFPSGS 507
: : I 1 : | | | : : : | | : : : : : : 1 : : 1
Db 553 EEEEEEEEDEEDEEDEEEKEDEEEKEDEEEKEDEED EEEKEDDEDEEEEEEGE 605

Qy 508 EHSAQEDGSEAAASDSSEADSDSD 531
I III : I : I :
Db 606 EKEDDEDDEEEEDEEDDEEEEDEE 629



RESULT 7
A40437
glutamic acid-rich protein, retinal - bovine
C;Species: Bos primigenius taurus (cattle)
C;Date: 14-Feb-1992 #sequence_revision 14-Feb-1992 #text_change 09-Jul-2004
C;Accession: A40437
R;Sugimoto, Y.; Yatsunami, K.; Tsujimoto, M.; Khorana, H.G.; Ichikawa, A.
Proc. Natl. Acad. Sci. U.S.A. 88, 3116-3119, 1991
A;Title: The amino acid sequence of a glutamic acid-rich protein from bovine
retina as deduced from the cDNA sequence.
A;Reference number: A40437; MUID:91195303; PMID:2014230
A;Accession: A40437
A;Status: preliminary
A;Molecule type: mRNA
A;Residues: 1-590 <SUG>
A;Cross-references: UNIPROT:Q28181; UNIPARC:UPI000016C311; GB:M61185;
NID:g163077; PIDN:AAA30536.1; PID:g163078

Query Match 8.5%; Score 236; DB 2; Length 590;
Best Local Similarity 22.4%; Pred. No. 3.3e-05;
Matches 133; Conservative 80; Mismatches 221; Indels 160; Gaps 28;

Qy 21 RTLPERSGVVCRVKY--CNSLPDI PFDPKFITYPFDQNRFVQYKATSLEKQHKHDLLTEP 78
111:1 : I : I : : I I I I : : : I I : II
Db 7 RVLPQPPGTPQKTKQEEEGTEPEPELEPKPETAPEE TELEEVSLPPE EP 55

Qy 79 DLGVTIDLINPDTYRIDPNVLLDPADEKLLEEEIQAPTS SKRSQQHAKVVPWMRK-- 133
! : I : : II : I I I I I : I : I I
Db 56 CVGKEVAAVTLGPQGTQETALTPPT SLQAQVSVAPEAHSSPRGWVLTWLRKGV 108

Qy 134 TEYISTEFNRYGISNEKPEVKIGVSVKQQFTEEEIYKDRDSQITAIE 180
' : I | | : : | : I I I : : I
Db 109 EKVVPQPAHSSRPSQNIAAGLESPDQQAGAQILGQCGTGG--SDEPSEPSRAEDPGPGPW 166

Qy 181 --KTFE-DAQKSISQHYSKPRVT PVEVMPVF 208
• : | | : : I : I I : : : : I : I :
Db 167 LLRWFEQNLEKMLPQ PPKISEGWRDEPTDAALGPEPPGPALEIKPMLQAQESPSLPA 223

Qy 209 PDFKMWINPCAQVIFDSDPAPKDTSGAAA LEM-MSQAMIRGMMDEEGNQF 257
i I : : I : I I I : I : : I I I I : I : I I I I : :
Db 224 PGPPEPEEEPI PEPQPTIQASSLPPPQDSARLMAWILHRLEMALPQPVIRGKGGEQESD- 282

Qy 258 VAYFLPVE ETLKKRKRDQEEEMDYAPDDVYDYKIAREY--NWNVKNKASKGYEENYF 312
II : I : : I I I : I I : I
Db 283 APVTCDVQTISILPGEQEE SHLILEEVDPHW 313

Qy 313 FIFREGDGVYYNELETRVRLSKRRAKAGVQSGTNALLVVKH RDMNEKELEAQEA 366
II I I I : I I : I II: I I I I I : I
Db 314 EEDEHQEGSTSTSPRTSE-AAPADEEKGE VVEQTPRELPRIQEEKEDEEEEK 364

Qy 367 RKAQLENHEPEEEEEEEMETEEKEAGGSDEEQEKGSSSEKEGSE-DEHSGSESEREEGDR 425
: I I | : | | I I I : I : I I : : I : I : I I : I I I : I I I I I I I
Db 365 EDGEEEEEEGREKEEEEGEEKEEEE-GREKEEEEGEKKEEEGREKEEEEGGEKEDEEGRE 423

Qy 426 DEASDKSGSGEDESSEDEARAARDKEEIFGSDADSEDDADSD DEDRG 472
I : I 1:1 II | | | | : | :::::: I : [ I :
Db 424 KEEEEGRGKEEEEGGEKEEEEGRGKEEVEGREEEEDEEEEQDHSVLLDSYLVPQSEEDQS 483

Qy 473 QAQGGSDNDSDSGSNGGGQRSRSHSRSASPFPSGSEHSAQEDGSE-AAASDSSE 525
: : : : : III::: I I I I : I I I I I I I
Db 484 E ESETQDQSEVGGAQTQGEVGGAQAL SEESETQDQSEVGGAQDQSE 529



RESULT 8
C89824
hypothetical protein sdrC [imported] - Staphylococcus aureus (strain N315)
C;Species: Staphylococcus aureus
C;Date: 10-May-2001 #sequence_revision 10-May-2001 #text_change 09-Jul-2004
C;Accession: C89824
R;Kuroda, M.; Ohta, T.; Uchiyama, I.; Baba, T.; Yuzawa, H.; Kobayashi, I.; Cui,
L.; Oguchi, A.; Aoki, K.; Nagai, Y.; Lian, J.; Ito, T.; Kanamori, M.; Matsumaru,
H.; Maruyama, A.; Murakami, H.; Hosoyama, A.; Mizutani-Ui, Y.; Kobayashi, N.;
Sawano, T.; Inoue, R.; Kaito, C.; Sekimizu, K.; Hirakawa, H.; Kuhara, S.; Goto,
S.; Yabuzaki, J.; Kanehisa, M.; Yamashita, A.; Oshima, K.; Furuya, K.; Yoshino,
C.; Shiba, T.; Hattori, M.; Ogasawara, N.; Hayashi, H.; Hiramatsu, K.
Lancet 357, 1225-1240, 2001
A;Title: Whole genome sequencing of meticillin-resistant Staphylococcus aureus.
A;Reference number: A89758; MUID:21311952; PMID:11418146
A;Accession: C89824
A;Status: preliminary
A;Molecule type: DNA
A;Residues: 1-953 <KUR>
A;Cross-references: UNIPROT:Q99W48; UNIPARC:UPI00000CAB80; GB:BA000018;
PID:g13700453; PIDN:BAB41750.1; GSPDB:GN00149
A;Experimental source: strain N315
C;Genetics:
A;Gene: sdrC

Query Match 8.5%; Score 234; DB 2; Length 953;
Best Local Similarity 23.0%; Pred. No. 7.4e-05;
Matches 131; Conservative 81; Mismatches 216; Indels 142; Gaps 28;

Qy 35 YCNSLPDIPFDP KFITYPF-DQNRFVQYKATSLEK QHKHDLLTEPD-LGVTID 85
: : : I I : I II I I I : I I I : I : : I hi
Db 375 FVTNLTGYKFNPDAKNFKIYEVTDQNQFVDSFTPDTSKLKDVTGQFDVIYSNDNKTATVD 434

Qy 86 LINPDTYRIDPNVLLDPADEKLLEEEIQAPTSSKRSQQHAKVVPWMRKTEY-ISTEFNRY 144
^ | : I : : I : : : : : : I : I I : I : : I : I : :
Db 435 LLNGQS SSDKQYI IQQVAYPDNS--STDNGKI DYTLETQNGKS 475

Qy 145 GISNEKPEVKIGVSVKQQFTEEEIYKDRDSQITAIEKTFEDAQKSISQHYSKPRVTPVEV 204
i II III ::: I I : I I I I :: : I I
Db 476 SWSNSYSNVN-GSSTAN--GDQKKYNLGD YVWEDTNKDGKQDANEKGIKGVYV 525

Qy 205 MPVFPDFKMWINPCAQVIFDSDPAPKDTSGAAALEMMSQAMIRGMMDEEGN-QF 257
i I : I I : : I :: I I I I I I
Db 526 I LKDSNG KELDRTTTDENGKYQFTGLSNG 554

Qy 258 VAYFLPVEETLKKRKRDQEEEMDY APDDVYD YKIAR EYN 296
i I : I I : : : I I : I I I : : I
Db 555 TYSVEFSTPAGYTPTTANAGTDDAVDSDGLTTTGVIKDADNMTLDSGFYKTPKYSLGDYV 614

Qy 297 -ASKGY- -EENYFFI FREGDGVYYNELETRV 330
: I I : I I I : I
Db 615 PDENGKYRFDNLDSGKY KV 669

Qy 331
Db 670

Qy 387
Db 727

Qy
Db 787

Qy 503
Db 846

RESULT 9
D89824
hypothetical protein sdrD [imported] - Staphylococcus aureus (strain N315)
C;Species: Staphylococcus aureus
C;Date: 10-May-2001 #sequence_revision 10-May-2001 #text_change 09-Jul-2004
C;Accession: D89824
R;Kuroda, M.; Ohta, T.; Uchiyama, I.; Baba, T.; Yuzawa, H.; Kobayashi, I.; Cui,
L.; Oguchi, A.; Aoki, K.; Nagai, Y.; Lian, J.; Ito, T.; Kanamori, M.; Matsumaru,
H.; Maruyama, A.; Murakami, H.; Hosoyama, A.; Mizutani-Ui, Y.; Kobayashi, N.;
Sawano, T.; Inoue, R.; Kaito, C.; Sekimizu, K.; Hirakawa, H.; Kuhara, S.; Goto,
S.; Yabuzaki, J.; Kanehisa, M.; Yamashita, A.; Oshima, K.; Furuya, K.; Yoshino,
C.; Shiba, T.; Hattori, M.; Ogasawara, N.; Hayashi, H.; Hiramatsu, K.
Lancet 357, 1225-1240, 2001
A;Title: Whole genome sequencing of meticillin-resistant Staphylococcus aureus.
A;Reference number: A89758; MUID:21311952; PMID:11418146
A;Accession: D89824
A;Status: preliminary
A;Molecule type: DNA
A;Residues: 1-1385 <KUR>
A;Cross-references: UNIPROT:Q99W47; UNIPARC:UPI00000CAA1F; GB:BA000018;
PID:g13700454; PIDN:BAB41751.1; GSPDB:GN00149
A;Experimental source: strain N315
C;Genetics:
A;Gene: sdrD

Query Match 8.3%; Score 230.5; DB 2; Length 1385;
Best Local Similarity 20.4%; Pred. No. 0.00017;
Matches 116; Conservative 89; Mismatches 236; Indels 127; Gaps 20;



Qy 27 SGVVCRVKYCNSLPDIPFDPKFITYPFDQNRFVQYKATSLEKQHKHDLLTEPDLGVTIDL 86
: I I : I : : | | I : I III : I I I : I
Db 772 TGVI NGADNMTLDSGF--YKTPKYNLGNYVWEDTNKDGKQDSTEKGISGVTVTL 823

Qy 87 INPD TYRID PNVLLDPADEKLLEEEIQAP 115
I : II::: I : II :: :
Db 824 KNENGEVLQTTKTDKDGKYQFTGLENGTYKVEFETPSGYTPTQVGSGTDEG-IDSNGTST 882

Qy 116 TSSKRSQQHAKV VPWMRKTEYISTEFNRYGISNEKPEVKIGVSVKQQFTEEEIYK 170
I : : : : I : I : : I : I : : : : I I : I I
Db 883 TGVIKDKDNDTIDSGFYKPTYNLGDYVWEDTNKNGVQDKDEKGISGVTV TLK 934

Qy 171 DRDSQI TAI EKT FEDAQKS I SQ HYSKPRVTPVEVMPVFPDFKMWINPCAQVIFDSD 226
; I : : : I I : : : I III : :
Db 935 DENDKVLKTVTTDENGKYQFTDLNNGTYKVEFETPSGYTPT SVTSGN 981

Qy 227 PAPKDTSGAAALEMMSQAMI RGMMDEEG NQFVAYFLPVEETLKKRKRDQEE 277
I I : : I : : I I : I : I I : : I I : I I
Db 982 DTEKDSNGLTTTGVIKDA--DNMTLDSGFYKTPKYSLGDYVWY DSNKDGKQDSTE 1034

Qy 278 EMDYAPDDVYDYKI AREYNWNVKNKASKGYEENYFFI FREGDGVYYNEL 326
i : : I I : : : I I : I I I : I :
Db 1035 K GIKDVKVILLNEKGEVIGTTKTDENGKYRFDNLDSGKYKVIFEKPTGL 1083

Qy 327 ETRVRLSKRRAKAGVQSGTNALLVVKHRDMNEKELEAQEARKAQLEN-HEPEEEEEEEME 385
I : I I I III::: | : | : I I : : :
Db 1084 TQTGTNTTEDDKDADGGEVDVTITDHDDFTLDNGYYEEETSDSDSD 1129

Qy 386 TEEKEAGGSDEEQEKGSSSEKEGSEDEHSGSESERE-EGDRDEASDKSGSGEDESSEDEA 444
: : | | : : | | : : | I I : I : : : I | I I I I I I : : :
Db 1130 SDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSD-SDSDSDSDSDSDS 1188

Qy 445 RAARDKEEIFGSDADSEDDADSDDEDRGQAQGGSDNDSDSGSNGGGQRSRSHSRSASPFP 504
: I : I I : I I : I : I I I : : I I : I I I I I : I I I I I
Db 1189 DSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSD-SDSDSDSDSDSD 1247

Qy 505 SGSEHSAQEDGSEAAASDS-SEADSDSD 531
II: : I : I I I I :: I I I I I
Db 1248 SDSDSDSDSDSDSDSDSDSDSDSDSDSD 1275

RESULT 10
F90070
Clumping factor B [imported] - Staphylococcus aureus (strain N315)
C;Species: Staphylococcus aureus
C;Date: 10-May-2001 #sequence_revision 10-May-2001 #text_change 09-Jul-2004
C;Accession: F90070
R;Kuroda, M.; Ohta, T.; Uchiyama, I.; Baba, T.; Yuzawa, H.; Kobayashi, I.; Cui,
L.; Oguchi, A.; Aoki, K.; Nagai, Y.; Lian, J.; Ito, T.; Kanamori, M.; Matsumaru,
H.; Maruyama, A.; Murakami, H.; Hosoyama, A.; Mizutani-Ui, Y.; Kobayashi, N.;
Sawano, T.; Inoue, R.; Kaito, C.; Sekimizu, K.; Hirakawa, H.; Kuhara, S.; Goto,
S.; Yabuzaki, J.; Kanehisa, M.; Yamashita, A.; Oshima, K.; Furuya, K.; Yoshino,
C.; Shiba, T.; Hattori, M.; Ogasawara, N.; Hayashi, H.; Hiramatsu, K.
Lancet 357, 1225-1240, 2001
A;Title: Whole genome sequencing of meticillin-resistant Staphylococcus aureus.
A;Reference number: A89758; MUID:21311952; PMID:11418146
A;Accession: F90070
A;Status: preliminary
A;Molecule type: DNA
A;Residues: 1-877 <KUR>
A;Cross-references: UNIPROT:Q99R07; UNIPARC:UPI00000CADCA; GB:BA000018;
PID:g13702588; PIDN:BAB43728.1; GSPDB:GN00149
A;Experimental source: strain N315
C;Genetics:
A;Gene: clfB

Query Match 8.3%; Score 230; DB 2; Length 877;
Best Local Similarity 22.6%; Pred. No. 0.00011;
Matches 125; Conservative 76; Mismatches 207; Indels 146; Gaps 23;

Qy 33 VKYCNSLPDIPF-DPKFITYPFDQNRFVQYKATSLEKQHKHDLLTEPDLGVTIDLIN 88
i I I II : I I I I I I II : I : I I : I I : I
Db 271 VDYSNSNNTMPIADIK STNGDVVAKAT YDILTKTYTFVFTDYVNNKE 317

Qy 89 PDTYRIDPNVLLDPADEKLLEE EIQAPTSSKRS 121
I : I I : : I I I : I I : I
Db 318 NINGQFSLPLFTDRAKAPKSGTYDANINI--ADEMFNNKITYNYSSPIAGIDKPNGANIS 375

Qy 122 QQHAKVVPWMRKTEYISTEFNRYGISNEKPEVKIGVSVKQQFTEEEIYKDRDSQITAIE- 180
I I : I I I III | : : : : | : : : : : | :
Db 376 SQI IGVDTASGQNT YKQTVF VNPKQRVLGNTWVYIKGYQDKI-EESSGKVSATDT 429

Qy 181 --KTFE--DAQKSISQHYSKPRVTPV-EVMPVFPDFKMWINPCAQVIFDSDPAPKDTSGA 235
: I I I I : | : | : : I I I : : : I I I
Db 430 KLRIFEVNDTSKLSDSYYADPNDSNLKEVTDQFKNRIYYEHPNVASIKFGD 480

Qy 236 AALEMMSQAMIRGMMDEEGNQFVAYFLPVEETLKKRKRDQEEEMDYAPDDVYDYKIAREY 295
: : : I I I I : I : : I III:
Db 481 --ITKTYVVLVEGHYDNTG KNLKTQVIQENVDPVTNRDYS I F 520

Qy 296 NWNVKNKASKGYEENYFFIFREGDGVYYNELETRVRLSKRRAKAGVQSGTNALLVVKHRD 355
11:1 : I I 11:1
Db 521 GWNNEN VVRYG GGSADGDSA 540

Qy 356 MNEKE LEAQEARKAQLE NHEPEEEEEEEMETEEKEAGGSDEEQEK 400
: I I : : : : : : I : I I I : : : : : III:
Db 541 VNPKDPTPGPPVDPEPSPDPEPEPTPDPEPSPDPEPEPSPDPDPDSDSDSDSGSDSDSGS 600

Qy 401 GSSSEKEGSEDEHSGSESERE-EGDRDEASDK-SGSGEDESSEDEARAARDKEEIFGSDA 458
Ml: I I I : I : : I I I I I I I I I : :: : I : II:
Db 601 DSDSESDSDSDSDSDSDSDSDSESDSDSESDSDSDSDSDSDSDSDSESDSDSDSDSDSDS 660

Qy 459 DSEDDADSDDEDRGQAQGGSDNDSDSGSNGGGQRSRSHSRSASPFPSGSEHSAQEDGSEA 518
I I : : : I I I I : : I I : I I I I I : I I I I I II: : : I
Db 661 DSDSESDSDSESDSESDSDSDSDSDSDSDSDSD-SDSDSDSDSDSDSDSDSDSESDSDSD 719

Qy 519 AASDS-SEADSDSD 531
: I I I I :: I I I I I
Db 720 SDSDSDSDSDSDSD 733



RESULT 11
S41539
fibrinogen-binding protein - Staphylococcus aureus
N;Alternate names: clumping factor
C;Species: Staphylococcus aureus
C;Date: 13-Jan-1995 #sequence_revision 13-Jan-1995 #text_change 09-Jul-2004
C;Accession: S41539; S36630
R;McDevitt, D.; Francois, P.; Vaudaux, P.; Foster, T.J.
Mol. Microbiol. 11, 237-248, 1994
A;Title: Molecular characterization of the clumping factor (fibrinogen receptor)
of Staphylococcus aureus.
A;Reference number: S41539; MUID:94224142; PMID:8170386
A;Accession: S41539
A;Status: preliminary
A;Molecule type: DNA
A;Residues: 1-933 <MCD>
A;Cross-references: UNIPROT:Q53653; UNIPARC:UPI00000BB5DF; EMBL:Z18852;
NID:g397525; PIDN:CAA79304.1; PID:g397526

Query Match 8.2%; Score 225.5; DB 2; Length 933;
Best Local Similarity 21.4%; Pred. No. 0.0002;
Matches 103; Conservative 76; Mismatches 187; Indels 115; Gaps 18;

Qy 83 TIDLINP--DTYRIDPNVLLDPADEKLLEEEIQAPTSSKRSQQHAKVVPWMRKTEYISTE 140
III I : : I I I : : : | : : : : I I :
Db 383 TIDQIDKTNNTYR--QTI YVNPSGDNVI APVLT 413

Qy 141 FNRYGISNEKPEVKIGVSVKQQFTEEEIYK DRDSQITAIEKTFEDAQKSISQHYS 195
III : I I I : : I I I : I I I I : : :
Db 414 GNLKPNTDSNALIDQQNTSIKVYKVDNAADLSESYFVNPENFEDVTNSVNITFP 467

Qy 196 KPRVTPVEVMPVFPDFKMWINPCAQVIFDSDPAPKDTSGAAALEMM SQAMIRGM- 249
I II II:: I II : : I I I I : I I
Db 468 NPNQYKVEFNT--PDDQITTPYIVVVNGHIDP NSKGDLALRSTLYGYNSNI IWRSMS 522

Qy 250 MDEEGNQFVAYFL PVEETLKKRKRDQEEEMDYAPDDVYDYKIAREYNWNVKNKA 303
II ||: : : : : : I : I : : I : I : :
Db 523 WDNE VAFNNGSGSGDGIDKPVVPEQPDEPGEIEPIPED SDS 563

Qy 304 SKGYEENYFFIFREGDGVYYNELETRVRLSKRRAKAGVQSGTNALL VVK 352
I : I : : I I I : : :
Db 564 DPGSDSG SDSNSDSGSDSGSDSTSDSGSDSASDSDSAS 601

Qy 353 HRDMNEKELEAQEARKAQLENHEPEEEEEEEMETEEKEAGGSDEEQEKGSSSEKEGSEDE 412
I I : : I ::::::::: I I : : I I : : I
Db 602 DSDSASDSDSASDSDSASDSDSDNDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDS 661

Qy 413 HSGSESERE-EGDRDEASDKSGSGEDESSEDEARAARDKEEIFGSDADSEDDADSDDEDR 471
I I : I : : : I I II I I I I : : : : I : I I : I I : I : I I I :
Db 662 DSDSDSDSDSDSDSDSDSD-SDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSD 720

Qy 472 GQAQGGSDNDSDSGSNGGGQRSRSHSRSASPFPSGSEHSAQEDGSEAAASDS-SEADSDS 530
: II : I I I I I : I I I I I I I : : I : I I I I I : : I I I I
Db 721 SDSDSDSDSDSDSDSDSDSD-SDSDSDSDSDSDSDSDSDSDSDSDSDSASDSDSDSDSDS 779

Qy 531 D 531
I
Db 780 D 780



RESULT 12 
A36811

hypothetical protein ORF48 - saimiriine herpesvirus 1 (strain 11)
C;Species: saimiriine herpesvirus 1
A;Note: host Saimiri sciureus (common squirrel monkey)
C;Date: 16-Oct-1992 #sequence_revision 16-Oct-1992 #text_change 08-Oct-1999
C;Accession: A36811
R;Albrecht, J.
submitted to the EMBL Data Library, January 1992
A;Description: Primary structure of the herpesvirus saimiri genome.
A;Reference number: A36806
A;Accession: A36811
A;Molecule type: DNA
A;Residues: 1-797 <ALB>
A;Cross-references: UNIPARC:UPI00001385A1; GB:X64346; NID:g60320;
PIDN:CAA45671.1; PID:g60369
R;Albrecht, J.C.; Nicholas, J.; Biller, D.; Cameron, K.R.; Biesinger, B.;
Newman, C.; Wittmann, S.; Craxton, M.A.; Coleman, H.; Fleckenstein, B.; Honess,
R.W.
J. Virol. 66, 5047-5058, 1992
A;Title: Primary structure of the herpesvirus saimiri genome.
A;Reference number: A37309; MUID:92333688; PMID:1321287
A;Contents: annotation; protein-coding frames
A;Note: neither protein nor nucleotide sequence is given
C;Genetics:
A;Gene: 48

Query Match 8.1%; Score 224; DB 2; Length 797;
Best Local Similarity 23.3%; Pred. No. 0.0002;
Matches 91; Conservative 56; Mismatches 181; Indels 62; Gaps 14;





Qy 157 VSVKQQFTEEEI YKDRDSQITAIEKTFEDAQKSISQHYSKPRVTPVEVMPVF 208
I I : I I : I : 1 : : : 1 : 1 : : : : 1 1 :
Db 330 VSEYEDFDEDEVELCISDDEVDSEDGNLCVL DDESESVNS-VALRQVLTVDKQANE 384

Qy 209 PDFKMWINPCAQVIFDSDPAPKDTSGAAALEMMSQAMIRGMMDEEGNQFVAYFLPVEETL 268
::| 1: 1 1 II |: :: 1 ::|| II
Db 385 KEYKKIIDKSD DRDDRDKD EYELENEEYNRDEEEDEGED EEDE 427

Qy 269 KKRKRDQEEEMDYAPDDVYDYKIAREYNWNVKNKASKGYEENYFFIFREGDGVYYNELET 328
I | : |:| I |: 1 ::: :| 1 : 1 :| :: 1
Db 428 KDEKEEGEDEGDDGEDEGED EGEDEGDEGDEGDE GEDEGEDEDDEED 474

Qy 329 RVRLSKRRAKAGVQSGTNALLVVKHRDMNEKELEAQEARKAQLENHEPEEEEEEEMETEE 388
1 | : 1 1 1 1 : : 1 1 : 1 : : 1 1 : 1
Db 475 EGEDEGDEGDEGEDEGDEG DEGEDEGDEGDEGKDEGDEGDEGKDEGDEGDE 525

Qy 389 KEAG — GSDEEQEKGSSSEKEGSEDEHSGSESERE EGD--RDEASDKSGSGEDESSE 441
: 1 1 1 1 : : : 1 1 1 1 1 1 1 1 1 1 III II 1 : 1 1 1 1 1
Db 526 GDEGDEGEDEWEDEGDEGEDEGDEGEDEGDEGEDEGEDEGDEGEDEGEDEGDEGEDE-GE 584

Qy 442 DEARAARDKEEIFGSDADSEDDADSDDEDRGQAQGGSDNDSDSGSNGGGQRSRSHSRSAS 501
II |:| I : : I : : 1 : 1 1 : : 1 1 II 1 :
Db 585 DEGDEGEDEGEDEGDEGEDEGEDEGDEGDEGEDEG — DEGEDEGDEGEDEGDEGEDEGDE 642

Qy 502 PFPSGSEHSAQEDGSEAAASDSSEADSDSD 531
II : 1 1 : I : : 1
Db 643 GEDEGDEGEDEGDEGEDEGDEGDEGEDEGD 672





RESULT 13 
T28679 



fibrinogen-binding protein homolog - Staphylococcus aureus
C;Species: Staphylococcus aureus
C;Date: 15-Oct-1999 #sequence_revision 15-Oct-1999 #text_change 09-Jul-2004
C;Accession: T28679
R;Josefsson, E.; McCrea, K.; Ni Eidhin, D.; O'Connell, D.; Cox, J.; Hook, M.;
Foster, T.J.
Microbiology 144, 3387-3395, 1998
A;Title: Three new members of the serine-aspartate repeat protein multigene
family of Staphylococcus aureus.
A;Reference number: Z20510; MUID:99098700; PMID:9884231
A;Accession: T28679
A;Status: preliminary; translated from GB/EMBL/DDBJ
A;Molecule type: DNA
A;Residues: 1-1315 <JOS>
A;Cross-references: UNIPROT:O86488; UNIPARC:UPI0000052285; EMBL:AJ005646;
NID:e1318791; PID:e1318792; PIDN:CAA06651.1
C;Genetics:
A;Gene: sdrD

Query Match 8.1%; Score 223.5; DB 2; Length 1315;
Best Local Similarity 21.7%; Pred. No. 0.00038;
Matches 121; Conservative 85; Mismatches 225; Indels 127; Gaps 23;

Qy 27 SGVVCRVKYCNSLPDIPFDPKFITYPFDQNRFVQYKATSLEKQHKHDLLTEPDLGVTIDL 86
! 1 : I I : | : : | I | : I III: III: I
Db 772 TGVI NGADNMTLDSGF — YKTPKYNLGNYVWEDTNKDGKQDSTEKGISGVTVTL 823

Qy 87 INPD TYRID PNVLLDPADEKLLEEEIQAP 115
! | : II::: I : I I : : :
Db 824 KNENGEVLQTTKTDKDGKYQFTGLENGTYKVEFETPSGYTPTQVGSGTDEG-IDSNGTST 882

Qy 116 TSSKRSQQHAKV VPWMRKTEYISTEFNRYGISNEKPEVKIGVSVKQQFTEEEIYK 170
1 I : : : : I : I : : I : I : : : : I I : I I
Db 883 TGVIKDKDNDTIDSGFYKPTYNLGDYVWEDTNKNGVQDKDEKGISGVTV TLK 934

Qy 171 DRDSQITAIEKTFEDAQKS I SQ HYSKPRVTPVEVMPVFPDFKMWINPCAQVIFDSD 226
? I : : : I I : : : I III : :
Db 935 DENDKVLKTVTTDENGKYQFTDLNNGTYKVEFETPSGYTPT SVTSGN 981

Qy 227 PAPKDTSGAAALEMMSQAMIRGMMDEEG NQFVAYFLPVEETLKKRKRDQEE 277
i I I : : I : : I I : I : I I : : I I : I I
Db 982 DTEKDSNGLTTTGVIKDA--DNMTLDSGFYKTPKYSLGDYVWY DSNKDGKQDSTE 1034

Qy 278 EMDYAPDDVYDYKIAREYNWNVKNK--ASKGYEENYFFIFREGDGVYYNELETRVRLSKR 335
: : I I : I I : : : I I : I I I : I I
Db 1035 K GIKDVKVTL LNEKGEVIGTTKTDENGKYCFDNLDSGKY KVIFEK- 1079

Qy 336 RAKAGV-QSGTNALLVVKHRDMNEKELEAQEARKAQLENHEPEEEEEEEMETEEKEAGGS 394
I I : I : I I I III:: : I : I III: I
Db 1080 --PAGLTQTGTNTTEDDKDADGGEVDVTITDHDDFTLDNGYYEEETSD S 1126

Qy 395 DEEQEKGSSSEKEGSEDEHSGSESEREEGDRDEASDKSGSGEDESSEDEARAARDKEEIF 454
I : : II::: I I I : I : : I I I I I I I I : : : : I :
Db 1127 DSDSDSDSDSDRDSDSDSDSDSDSD-SDSDSDSDSD-SDSDSDRDSDSDSDSDSDSDSDS 1184

Qy 455 GSDADSEDDADSDDEDRGQAQGGSDNDSDSGSNGGGQRSRSHSRSASPFPSGSEHSAQED 514
11:11:1:111: : I I : I I I I I : I I I I II: : I
Db 1185 DSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSD SDSDSDSDSDSDSDSDSDSD 1237

Qy 515 GSEAAASDS-SEADSDSD 531
! : I I I . I : :.! I I I I
Db 1238 SDSDSDSDSDSDSDSDSD 1255



RESULT 14 
A71623 

probable secreted protein PFB0115w - malaria parasite (Plasmodium falciparum)
C;Species: Plasmodium falciparum
C;Date: 13-Nov-1998 #sequence_revision 13-Nov-1998 #text_change 09-Jul-2004
C;Accession: A71623
R;Gardner, M.J.; Tettelin, H.; Carucci, D.J.; Cummings, L.M.; Aravind, L.;
Koonin, E.V.; Shallom, S.; Mason, T.; Yu, K.; Fujii, C.; Pederson, J.; Shen, K.;
Jing, J.; Aston, C.; Lai, Z.; Schwartz, D.C.; Pertea, M.; Salzberg, S.; Zhou,
L.; Sutton, G.G.; Clayton, R.; White, O.; Smith, H.O.; Fraser, C.M.; Adams,
M.D.; Venter, J.C.; Hoffman, S.L.
Science 282, 1126-1132, 1998
A;Title: Chromosome 2 sequence of the human malaria parasite Plasmodium
falciparum.
A;Reference number: A71600; MUID:99021743; PMID:9804551
A;Accession: A71623
A;Status: preliminary; nucleic acid sequence not shown; translation not shown
A;Molecule type: DNA
A;Residues: 1-1192 <GAR>
A;Cross-references: UNIPROT:O96127; UNIPARC:UPI0000076661; GB:AE001373;
GB:AE001362; NID:g3845097; PIDN:AAC71813.1; PID:g3845099; TIGR:PFB0115w
A;Experimental source: clone 3D7
C;Genetics:
A;Gene: PFB0115w

Query Match 8.0%; Score 221; DB 2; Length 1192;
Best Local Similarity 20.2%; Pred. No. 0.00045;
Matches 106; Conservative 90; Mismatches 196; Indels 132; Gaps 20;

Qy 55 QNRFVQYKATSLE KQHKHDLLTEPDLGVTI DLINPDTYRI DPNVLLDPADEKLL 108
! I : : : I : : I : I I I : : I I : I I I III:
Db 296 QSKYKQERIEILDDNGKELKSHKN— IKEEKGGIE KTDTTNI ADIKIK 341

Qy 109 EEEIQAPTSSKRS-QQHAKVVPWMRKTEYISTEFNRYGISNEKPEVKIGVSVKQQFTEEE 167
; : I I : : : : | | | | : : | : : : | | | : : : I I
Db 342 KEERETKDEKEKNIQQLVKDVQLIKVGE ETKDDEKEDKEGTDDEEDTDDEE 392

Qy 168 IYKDR DSQITAIEKTFEDAQKSISQHYSKPRVTPVEVMPVFPDFKMWINPCAQVIF 223
; " I | : | : | : I I : : : I ::
Db 393 DTDDEEDTDDEEDTSDEETTGDQENKEETEVDEKKTEKAE EELEE 437

Qy 224 DSDPAPKDTSGAAALEMMSQAMIRGMMDEEGNQFVAYFLPVEETLKKRKRDQEEEMDYAP 283
I : : I I : : I : I : I : I : : I : I : I : I
Db 438 DKEESEKDKEESEKDKEESE KDKEES EKDKEKTEEDEEKTEDEKG 482

Qy 284 DDVY DYKIAREYNWNVKNKASKGYEENYFFI FREGDGVYYNELETRVRLSKRR 336
i : I I : I I I : : I I I : : I I : I I
Db 483 TEVYKKETDVDEKKEKGEYGEGTDDEEDKEKEE DDEETKVEEKKTE 528

Qy 337 AKAGVQSGTNALLVVKHRDMNEKELEAQ-EARKAQLENHEPEEEEEE EMETEEKEAG 392
I I • • I • • I • I • I • I I I I . I . I I • I I •
Db 529 KD EEGTD YEEDTDDSDKDEETKVEEKKTERDEEETEEDEKETEVEKKKTEKDEE 582

Qy 393 GSDEEQEKGSSSE KEGSEDEHSGSE SEREEGDRDEASD 430
I : I I : : I : : I I : : I I I : I : I : I : I
Db 583 GTDYEEDTDDSDKDVETEVEETDAEDKEENEEGTDDEEDKVEETDLDDQEEDGEEDKEDD 642

Qy 431 KSGSGEDESSEDEARAARDKEEIFGSD ADSEDDADSDDEDRGQAQGGSDNDSDSGSN 487
I I I : : I : : I : I : I I I I I : I I I : : : I :
Db 643 KEKDKEDDKEDDKEKDKEDDKEKYKEDDKEDDKEDDKEKDKEDNKEKDKEDNKEKDKEDD 702

Qy 488 GGGQRSRSHSRSASPFPSGSEHSAQEDGSEAAASDSSEADSDSD 531
: : : I I I I : I I : I
Db 703 KEKDKEDDKEKD KEDNKEKDKEDNKEKDKEDD 734



RESULT 15 
A54138 

acidic repetitive protein arp1 - Tetrahymena thermophila
C;Species: Tetrahymena thermophila
C;Date: 29-Sep-1999 #sequence_revision 29-Sep-1999 #text_change 09-Jul-2004
C;Accession: A54138
R;Heinonen, T.Y.; Pearlman, R.E.
J. Biol. Chem. 269, 17428-17433, 1994
A;Title: A germ line-specific sequence element in an intron in Tetrahymena
thermophila.
A;Reference number: A54138; MUID:94292495; PMID:8021245
A;Accession: A54138
A;Molecule type: DNA
A;Residues: 1-334 <HEI>
A;Cross-references: UNIPROT:O77406; UNIPARC:UPI000007FF9E; GB:X76125;
NID:g426479; PIDN:CAA53731.1; PID:e1326004; PID:g3676249
A;Experimental source: strain CU329, macronuclei
A;Note: sequence extracted from NCBI backbone (NCBIN:149332, NCBIP:149333)
C;Genetics:
A;Gene: TAP1
A;Genetic code: SGC5
A;Introns: 64/1; 158/1

Query Match 8.0%; Score 220; DB 2; Length 334;
Best Local Similarity 27.6%; Pred. No. 0.00012;
Matches 64; Conservative 43; Mismatches 85; Indels 40; Gaps 11;

Qy 336 RAKAGVQSGTNALLVVKHRDMNE KELEAQEARKAQLENHEPEE EEE 381
: I : : I : I I I : I I I : : I : : I : I I : : I
Db 24 RKEKPIQKSHSA--VSKETEMTENTPKLIQDDEENADEGDNGDDEESGDSDDDSGDSDDE 81

Qy 382 EEMETEEKEAGGSDEEQEKGSSSEKEG-SEDEHSGSESEREEGDRD EASDKSG 433
' I : : : : : I : I I I : : : | I : I I : I I I I : I I I I : : I : I
Db 82 ESGDSDDEESGDSDDQESGDSDDEESGDSDDEESGDSDDEESGDSDDDNGDSDDDNGDSD 141

Qy 434 — SGEDESSE DEARAARDKEE I FGS DADS EDDA- DS DD- E DRGQAQGGS D- 479
: I : I : I : : I : I I I : I I I : I I I I I I I I h I I
Db 142 EDNGDDDSNDDDNGDDENGDDAEDGDDAED--GDDAEDGDDAEDGDDAEDGDDAEDGDDA 199

Qy 480 NDSDSGSNGGGQRSRSHSRSASPFPSGSEHSAQEDGSEAAASDSSEADSDSD 531
i II : I : I : : I I I : I I : I I : :
Db 200 EDGDDAEDGDAAEDGDDAEDGDDAEDGDDNEDAEDGDDAEDGDDAEDGDDNE 251

Search completed: April 25, 2006, 09:11:53
Job time : 45 secs



GenCore version 5.1.7 
Copyright (c) 1993 - 2006 Biocceleration Ltd. 



OM protein - protein search, using sw model 
Run on:         April 25, 2006, 09:03:34 ; Search time 189 Seconds
                                           (without alignments)
                                           1234.445 Million cell updates/sec

Title:          US-10-721-553-2
Perfect score:  2764
Sequence:       1 MAPTIQTQAQREDGHRPNSH..........QEDGSEAAASDSSEADSDSD 531

Scoring table:  BLOSUM62
                Gapop 10.0 , Gapext 0.5

Searched:       2443163 seqs, 439378781 residues

Total number of hits satisfying chosen parameters:      2443163

Minimum DB seq length: 0
Maximum DB seq length: 2000000000

Post-processing: Minimum Match 0%
                 Maximum Match 100%
                 Listing first 45 summaries

Database:       A_Geneseq_21:*
                1: geneseqp1980s:*
                2: geneseqp1990s:*
                3: geneseqp2000s:*
                4: geneseqp2001s:*
                5: geneseqp2002s:*
                6: geneseqp2003as:*
                7: geneseqp2003bs:*
                8: geneseqp2004s:*
                9: geneseqp2005s:*

Pred. No. is the number of results predicted by chance to have a 
score greater than or equal to the score of the result being printed, 
and is derived by analysis of the total score distribution. 
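
The other derived figures in this report can be cross-checked from the raw values it prints. The sketch below is not part of the GenCore output. It assumes, because the numbers in this listing are consistent with it, that Query Match is the score as a percentage of the perfect score, that Best Local Similarity is the number of matches divided by the sum of matches, conservative substitutions, mismatches and indels, and that the cell-update rate is query length times database residues divided by search time. The function names are hypothetical.

```python
def query_match(score, perfect_score):
    """Score as a percentage of the perfect (self-match) score, to one decimal."""
    return round(100.0 * score / perfect_score, 1)


def best_local_similarity(matches, conservative, mismatches, indels):
    """Identical positions as a percentage of all aligned columns (assumed definition)."""
    columns = matches + conservative + mismatches + indels
    return round(100.0 * matches / columns, 1)


def million_cell_updates_per_sec(query_len, db_residues, seconds):
    """Dynamic-programming cells filled per second, in millions (assumed definition)."""
    return query_len * db_residues / seconds / 1e6


# Values taken from this report: perfect score 2764, query length 531.
print(query_match(225.5, 2764))                  # 8.2
print(best_local_similarity(103, 76, 187, 115))  # 21.4
print(round(million_cell_updates_per_sec(531, 439378781, 189), 3))  # 1234.445
```

Pred. No. itself cannot be recomputed this way, since it depends on the full score distribution of the search, which the report does not list.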



                               SUMMARIES

                   %
Result           Query
   No.    Score  Match  Length  DB   ID              Description
     1     2764  100.0     531  2    AAY42226        Aay42226 Human pan
     2     2764  100.0     531  7    ADD18712        Add18712 Human dis
     3     2764  100.0     531  8    ADO58688        Ado58688 Human reg
     4     2764  100.0     531  8    ABM82102        Abm82102 Tumour-as
     5     2744   99.3     531  4    AAB93517        Aab93517 Human pro
     6   2658.5   96.2     553  4    ABG19682        Abg19682 Novel hum
     7     2464   89.1     473  3    AAB56316        Aab56316 Human sec
     8   1244.5   45.0     538  4    ABB59163        Abb59163 Drosophil
     9      622   22.5   1 o o  4    ABG19ool        ADgiy do i wovei nuiu
    10      595   21.5     115       AAGU3320        Aaguooz o Human sec
    11      283   10.2     4/0  4    AKrrl Q41 9     ADg J.y H KT^NT.r^T V* i urn
    12      283   10.2   4 / 0  Q 0  ADSlZZbO        AuS IZZoO Uiima vv 4~ Vv d
    13      253    9.2     445  c O  ABRooz4o        rroLcin s
    14      253    9.2     445  7    ADKooo / 0      AOKDO D / U uiScdse t.
    15    237.5    8.6  1 6 JO  6    ABU 4 2. D I o  rioccin e
    16    234.5    8.5    1802  3    AAY83170        Aay83170 ueii wan
    17    234.5    8.5    1802  3    AA Y / 0 1 1 9  Aa y / u x i y ouapn. ep
    18      234    8.5     953  6    ABU16533        Abu16533 Protein e
    19    233.5    8.4     930  2    AAY08641        Aay08641 S. aureus
    20    233.5    8.4     947  6    ABJ18940        Abj18940 Pathogen
    21      233    8.4     932  4    AAU36845        Aau36845 Staphyloc
    22      233    8.4     932  4    AAU34082        Aau34082 Staphyloc
    23    232.5    8.4     995  6    ABM72437        Abm72437 Staphyloc
    24    230.5    8.3    1385  6    ABU16400        Abu16400 Protein e
    25      230    8.3     839  8    ADU02517        Adu02517 Novel hum
    26      230    8.3     877  6    ABU42504        Abu42504 Protein e
    27      229    8.3    1920  6    ABU43489        Abu43489 Protein e
    28      228    8.2     428  5    ABG93245        Abg93245 C. albica
    29      227    8.2     567  4    AAE13147        Aae13147 Human ret
    30      226    8.2     743  6    ADA89690        Ada89690 Staphyloc
    31      226    8.2     877  6    ADA89539        Ada89539 Staphyloc
    32      226    8.2     877  6    ABM72702        Abm72702 Staphyloc
    33      226    8.2     913  6    ABJ18917        Abj18917 Pathogen
    34    225.5    8.2     927  6    ABM72221        Abm72221 Staphyloc
    35    225.5    8.2     933  3    AAY58435        Aay58435 Staphyloc
    36    225.5    8.2     933  4    AAB69508        Aab69508 Staphyloc
    37    225.5    8.2     933  6    ABJ18947        Abj18947 Pathogen
    38    225.5    8.2     936  2    AAW89801        Aaw89801 Staphyloc
    39    224.5    8.1     194  4    ABG11265        Abg11265 Novel hum
    40      224    8.1     265  5    ABG32640        Abg32640 Staphyloc
    41    223.5    8.1    1315  2    AAY08642        Aay08642 S. aureus
    42    223.5    8.1    1315  6    ABJ18969        Abj18969 Pathogen
    43      222    8.0     918  2    AAY08640        Aay08640 S. aureus
    44    221.5    8.0     565  9    ADW23812        Adw23812 Staphyloc
    45    221.5    8.0    1132  2    AAR97866        Aar97866 Chicken 1



ALIGNMENTS 



RESULT 1 
AAY42226 

ID AAY42226 standard; protein; 531 AA. 
XX 

AC AAY42226; 
XX 

DT 20-DEC-1999 (first entry) 
XX 

DE Human pancreatic differentiation 2 protein sequence. 
XX 

KW Human; PD2; cancer; regulation; differentiation; neoplastic; therapy; 

KW pancreatic differentiation 2; diagnosis; pancreatic adenocarcinoma; 

KW phosphoprotein. 
XX 

OS Homo sapiens. 



XX 

PN WO9950408-A1. 
XX 

PD 07-OCT-1999. 
XX 

PF 26-MAR-1999; 99WO-US006633.
XX

PR 27-MAR-1998; 98US-0079649P.
XX 

PA (UYNE-) UNIV NEBRASKA. 
XX 

PI Batra SK, Hollingsworth MA; 
XX 

DR WPI; 1999-591317/50. 

DR N-PSDB; AAZ25433. 
XX 

PT New phosphoprotein useful as targets for therapy of pancreatic 

PT adenocarcinomas.

XX 

PS Claim 7; Fig 2; 97pp; English. 
XX 

CC The present sequence is the human pancreatic differentiation 2 (PD2) 

CC protein, which comprises an amino terminal helix-loop-helix domain and a 

CC centrally localised nuclear transporter signal and nucleotide binding 

CC site. The PD2 nucleotide sequence and a transformed host cell are useful 

CC for screening a test compounds for PD2 modulating activity indicated by 

CC an alteration in the phosphorylation of status of PD2. The host cells are

CC assessed for altered expression of pancreatic differentiation markers

CC (MUC-1 or carbonic anhydrase), and modulating activity is correlated with

CC an alteration in cellular morphology. The PD gene and protein represent 

CC valuable targets in the differential diagnosis and therapy of pancreatic 

CC adenocarcinomas 
XX 

SQ Sequence 531 AA; 

Query Match 100.0%; Score 2764; DB 2; Length 531; 
Best Local Similarity 100.0%; Pred. No. 1.5e-196; 

Matches 531; Conservative 0; Mismatches 0; Indels 0; Gaps 0; 

Qy 1 MAPTIQTQAQREDGHRPNSHRTLPERSGVVCRVKYCNSLPDIPFDPKFITYPFDQNRFVQ 60
     ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1 MAPTIQTQAQREDGHRPNSHRTLPERSGVVCRVKYCNSLPDIPFDPKFITYPFDQNRFVQ 60

Qy 61 YKATSLEKQHKHDLLTEPDLGVTIDLINPDTYRIDPNVLLDPADEKLLEEEIQAPTSSKR 120
      ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 61 YKATSLEKQHKHDLLTEPDLGVTIDLINPDTYRIDPNVLLDPADEKLLEEEIQAPTSSKR 120

Qy 121 SQQHAKVVPWMRKTEYISTEFNRYGISNEKPEVKIGVSVKQQFTEEEIYKDRDSQITAIE 180
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 121 SQQHAKVVPWMRKTEYISTEFNRYGISNEKPEVKIGVSVKQQFTEEEIYKDRDSQITAIE 180

Qy 181 KTFEDAQKSISQHYSKPRVTPVEVMPVFPDFKMWINPCAQVIFDSDPAPKDTSGAAALEM 240
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 181 KTFEDAQKSISQHYSKPRVTPVEVMPVFPDFKMWINPCAQVIFDSDPAPKDTSGAAALEM 240

Qy 241 MSQAMIRGMMDEEGNQFVAYFLPVEETLKKRKRDQEEEMDYAPDDVYDYKIAREYNWNVK 300
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 241 MSQAMIRGMMDEEGNQFVAYFLPVEETLKKRKRDQEEEMDYAPDDVYDYKIAREYNWNVK 300

Qy 301 NKASKGYEENYFFIFREGDGVYYNELETRVRLSKRRAKAGVQSGTNALLVVKHRDMNEKE 360
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 301 NKASKGYEENYFFIFREGDGVYYNELETRVRLSKRRAKAGVQSGTNALLVVKHRDMNEKE 360

Qy 361 LEAQEARKAQLENHEPEEEEEEEMETEEKEAGGSDEEQEKGSSSEKEGSEDEHSGSESER 420
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 361 LEAQEARKAQLENHEPEEEEEEEMETEEKEAGGSDEEQEKGSSSEKEGSEDEHSGSESER 420

Qy 421 EEGDRDEASDKSGSGEDESSEDEARAARDKEEIFGSDADSEDDADSDDEDRGQAQGGSDN 480
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 421 EEGDRDEASDKSGSGEDESSEDEARAARDKEEIFGSDADSEDDADSDDEDRGQAQGGSDN 480

Qy 481 DSDSGSNGGGQRSRSHSRSASPFPSGSEHSAQEDGSEAAASDSSEADSDSD 531
       |||||||||||||||||||||||||||||||||||||||||||||||||||
Db 481 DSDSGSNGGGQRSRSHSRSASPFPSGSEHSAQEDGSEAAASDSSEADSDSD 531





RESULT 2 
ADD18712 

ID ADD18712 standard; protein; 531 AA. 
XX 

AC ADD18712; 
XX 

DT 15-JAN-2004 (first entry) 
XX 

DE Human disease related protein SeqID143. 
XX 

KW human; disease state; cytostatic; antiinflammatory; ophthalmological; 

KW antiarteriosclerotic; vulnerary; gene therapy; 

KW hypoxia-regulated condition; tumourigenesis; angiogenesis; apoptosis;

KW inflammation; erythropoiesis; glycolysis; gluconeogenesis;

KW glucose transportation; catecholamine synthesis; iron transport; 

KW nitric oxide synthesis; cancer; ischaemic condition; reperfusion injury; 

KW retinopathy; neonatal stress; pre-eclampsia; atherosclerosis; 

KW inflammatory condition; wound healing. 

XX 

OS Homo sapiens.
XX

PN WO2003018621-A2.
XX

PD 06-MAR-2003.
XX

PF 23-AUG-2002; 2002WO-GB003892.
XX

PR 23-AUG-2001; 2001GB-00020558.

PR 05-OCT-2001; 2001GB-00024037.
XX 

PA (OXFO-) OXFORD BIOMEDICA UK LTD. 
XX 

PI Kingsman SM, White J, Ward NR, Harris RA, Naylor S, Mundy CR; 
XX 

DR WPI; 2003-290046/28. 

DR N-PSDB; ADD18713. 
XX 

PT New substantially purified polypeptide, useful for diagnosing or treating 



PT a hypoxia-regulated condition, such as cancer, ischemia, reperfusion 

PT injury, retinopathy, pre-eclampsia, atherosclerosis, inflammation, or 

PT wound healing. 
XX 

PS Claim 25; SEQ ID NO 143; 424pp; English. 
XX 

CC This invention relates to novel human genes and gene product which are 

CC implicated in certain disease states. Compounds which modulate the 

CC proteins of the invention may have cytostatic, antiinflammatory, 

CC ophthalmological, antiarteriosclerotic or vulnerary activities. The 

CC sequences of the invention may be useful for gene therapy. The invention 

CC may be useful for diagnosing or treating a hypoxia-regulated condition, 

CC such as tumourigenesis, angiogenesis, apoptosis, inflammation,

CC erythropoiesis, or the biological response to hypoxia conditions 

CC including processes such as glycolysis, gluconeogenesis, glucose 

CC transportation, catecholamine synthesis, iron transport or nitric oxide 

CC synthesis. The disease includes cancer, ischaemic conditions, reperfusion 

CC injury, retinopathy, neonatal stress, pre-eclampsia, atherosclerosis, 

CC inflammatory conditions or wound healing. The present sequence is that of 

CC a disease related protein of the invention. 

XX 

SQ Sequence 531 AA; 



Query Match 100.0%; Score 2764; DB 7; Length 531; 

Best Local Similarity 100.0%; Pred. No. 1.5e-196; 

Matches 531; Conservative 0; Mismatches 0; Indels 0; Gaps 0; 



Qy 1 MAPTIQTQAQREDGHRPNSHRTLPERSGVVCRVKYCNSLPDIPFDPKFITYPFDQNRFVQ 60
     ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1 MAPTIQTQAQREDGHRPNSHRTLPERSGVVCRVKYCNSLPDIPFDPKFITYPFDQNRFVQ 60

Qy 61 YKATSLEKQHKHDLLTEPDLGVTIDLINPDTYRIDPNVLLDPADEKLLEEEIQAPTSSKR 120
      ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 61 YKATSLEKQHKHDLLTEPDLGVTIDLINPDTYRIDPNVLLDPADEKLLEEEIQAPTSSKR 120

Qy 121 SQQHAKVVPWMRKTEYISTEFNRYGISNEKPEVKIGVSVKQQFTEEEIYKDRDSQITAIE 180
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 121 SQQHAKVVPWMRKTEYISTEFNRYGISNEKPEVKIGVSVKQQFTEEEIYKDRDSQITAIE 180

Qy 181 KTFEDAQKSISQHYSKPRVTPVEVMPVFPDFKMWINPCAQVIFDSDPAPKDTSGAAALEM 240
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 181 KTFEDAQKSISQHYSKPRVTPVEVMPVFPDFKMWINPCAQVIFDSDPAPKDTSGAAALEM 240

Qy 241 MSQAMIRGMMDEEGNQFVAYFLPVEETLKKRKRDQEEEMDYAPDDVYDYKIAREYNWNVK 300
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 241 MSQAMIRGMMDEEGNQFVAYFLPVEETLKKRKRDQEEEMDYAPDDVYDYKIAREYNWNVK 300

Qy 301 NKASKGYEENYFFIFREGDGVYYNELETRVRLSKRRAKAGVQSGTNALLVVKHRDMNEKE 360
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 301 NKASKGYEENYFFIFREGDGVYYNELETRVRLSKRRAKAGVQSGTNALLVVKHRDMNEKE 360

Qy 361 LEAQEARKAQLENHEPEEEEEEEMETEEKEAGGSDEEQEKGSSSEKEGSEDEHSGSESER 420
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 361 LEAQEARKAQLENHEPEEEEEEEMETEEKEAGGSDEEQEKGSSSEKEGSEDEHSGSESER 420

Qy 421 EEGDRDEASDKSGSGEDESSEDEARAARDKEEIFGSDADSEDDADSDDEDRGQAQGGSDN 480
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 421 EEGDRDEASDKSGSGEDESSEDEARAARDKEEIFGSDADSEDDADSDDEDRGQAQGGSDN 480

Qy 481 DSDSGSNGGGQRSRSHSRSASPFPSGSEHSAQEDGSEAAASDSSEADSDSD 531
       |||||||||||||||||||||||||||||||||||||||||||||||||||
Db 481 DSDSGSNGGGQRSRSHSRSASPFPSGSEHSAQEDGSEAAASDSSEADSDSD 531



RESULT 3 
ADO58688

ID ADO58688 standard; protein; 531 AA.
XX

AC ADO58688;
XX 

DT 15-JUL-2004 (first entry) 
XX 

DE Human regulatory molecule HRM-9. 
XX 

KW cytostatic; immunomodulator; agonist; antagonist; gene therapy;

KW human regulatory molecule; HRM; disease development; cell proliferation; 

KW immune response; cancer. 

XX 

OS Homo sapiens. 
XX 

PN US2002058264-A1. 
XX 

PD 16-MAY-2002. 
XX 

PF 26-SEP-2001; 2001US-00840787.
XX

PR 23-SEP-1997; 97US-00933750.

PR 20-JAN-1999; 99US-00234613.

PR 03-MAR-2000; 2000US-00518865.
XX 

PA (INCY-) INCYTE PHARM INC. 
XX 

PI Lai P, Hillman JL, Bandman O, Shah P, Au-Young J, Yue H; 

PI Guegler KJ, Corley NC; 

XX 

DR WPI; 2004-459763/43. 

DR N-PSDB; ADO58737.
XX 

PT New human regulatory molecules, useful in the diagnosis and treatment of 

PT cancer and immune disorders. 

XX 

PS Claim 1; SEQ ID NO 9; 116pp; English. 
XX 

CC The invention describes human regulatory molecules (HRM) (I) selected 

CC from a group comprising the fully defined amino acid sequences of SEQ ID 

CC NOs: 1-49. Also described are: an isolated polynucleotide (II) comprising 

CC a nucleic acid sequence encoding (I) or the complement of the 

CC polynucleotide (SEQ ID NOs:50-98); a composition comprising (II) and a 

CC reporter molecule; an expression vector containing (II); a host cell

CC containing the vector; detecting (M1) expression of a nucleic acid in a

CC sample; screening (M2) a plurality of molecules to identify a ligand;

CC diagnosing (M3) a disease associated with gene expression in a sample

CC containing nucleic acids; a composition comprising (I) and a 

CC pharmaceutical carrier or a labeling moiety; screening (M4) a plurality 



CC of molecules to identify a ligand; preparation and purification of 

CC antibodies; an antibody which specifically binds to (I); and detecting 

CC protein expression in a sample. The new human regulatory protein 

CC molecules which are expressed during disease development and the 

CC polynucleotides which encode them satisfies a need in the art by 

CC providing compositions which are useful in the diagnosis and treatment of 

CC diseases associated with cell proliferation, particularly immune

CC responses and cancers. This is the amino acid sequence of a human 

CC regulatory molecule. 

XX 

SQ Sequence 531 AA; 

Query Match 100.0%; Score 2764; DB 8; Length 531; 

Best Local Similarity 100.0%; Pred. No. 1.5e-196; 

Matches 531; Conservative 0; Mismatches 0; Indels 0; Gaps 0; 



Qy 1 MAPTIQTQAQREDGHRPNSHRTLPERSGVVCRVKYCNSLPDIPFDPKFITYPFDQNRFVQ 60
     ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1 MAPTIQTQAQREDGHRPNSHRTLPERSGVVCRVKYCNSLPDIPFDPKFITYPFDQNRFVQ 60

Qy 61 YKATSLEKQHKHDLLTEPDLGVTIDLINPDTYRIDPNVLLDPADEKLLEEEIQAPTSSKR 120
      ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 61 YKATSLEKQHKHDLLTEPDLGVTIDLINPDTYRIDPNVLLDPADEKLLEEEIQAPTSSKR 120

Qy 121 SQQHAKVVPWMRKTEYISTEFNRYGISNEKPEVKIGVSVKQQFTEEEIYKDRDSQITAIE 180
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 121 SQQHAKVVPWMRKTEYISTEFNRYGISNEKPEVKIGVSVKQQFTEEEIYKDRDSQITAIE 180

Qy 181 KTFEDAQKSISQHYSKPRVTPVEVMPVFPDFKMWINPCAQVIFDSDPAPKDTSGAAALEM 240
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 181 KTFEDAQKSISQHYSKPRVTPVEVMPVFPDFKMWINPCAQVIFDSDPAPKDTSGAAALEM 240

Qy 241 MSQAMIRGMMDEEGNQFVAYFLPVEETLKKRKRDQEEEMDYAPDDVYDYKIAREYNWNVK 300
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 241 MSQAMIRGMMDEEGNQFVAYFLPVEETLKKRKRDQEEEMDYAPDDVYDYKIAREYNWNVK 300

Qy 301 NKASKGYEENYFFIFREGDGVYYNELETRVRLSKRRAKAGVQSGTNALLVVKHRDMNEKE 360
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 301 NKASKGYEENYFFIFREGDGVYYNELETRVRLSKRRAKAGVQSGTNALLVVKHRDMNEKE 360

Qy 361 LEAQEARKAQLENHEPEEEEEEEMETEEKEAGGSDEEQEKGSSSEKEGSEDEHSGSESER 420
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 361 LEAQEARKAQLENHEPEEEEEEEMETEEKEAGGSDEEQEKGSSSEKEGSEDEHSGSESER 420

Qy 421 EEGDRDEASDKSGSGEDESSEDEARAARDKEEIFGSDADSEDDADSDDEDRGQAQGGSDN 480
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 421 EEGDRDEASDKSGSGEDESSEDEARAARDKEEIFGSDADSEDDADSDDEDRGQAQGGSDN 480

Qy 481 DSDSGSNGGGQRSRSHSRSASPFPSGSEHSAQEDGSEAAASDSSEADSDSD 531
       |||||||||||||||||||||||||||||||||||||||||||||||||||
Db 481 DSDSGSNGGGQRSRSHSRSASPFPSGSEHSAQEDGSEAAASDSSEADSDSD 531





RESULT 4 
ABM82102 

ID ABM82102 standard; protein; 531 AA. 
XX 



AC ABM82102; 
XX 

DT 18-NOV-2004 (first entry) 
XX 

DE Tumour-associated antigenic target (TAT) polypeptide PRO83014, SEQ:5424. 
XX 

KW Tumour-associated antigenic target; TAT; human; overexpression; cancer; 

KW tumour; diagnosis; cell proliferative disorder; breast cancer; 

KW colorectal cancer; lung cancer; ovarian cancer; liver cancer; 

KW central nervous system cancer; bladder cancer; pancreatic cancer; 

KW cervical cancer; melanoma; leukaemia; hybridisation probe; 

KW chromosome identification; chromosome mapping; gene mapping; 

KW gene therapy; cytostatic. 

XX 

OS Homo sapiens. 
XX 

PN WO2004030615-A2.
XX

PD 15-APR-2004.
XX

PF 29-SEP-2003; 2003WO-US028547.
XX

PR 02-OCT-2002; 2002US-0414971P.
XX

PA (GETH) GENENTECH INC.
XX 

PI Wu TD, Zhang Z, Zhou Y; 
XX 

DR WPI; 2004-347921/32. 

DR N-PSDB; ACN40565. 
XX 

PT New tumor-associated antigenic target polypeptides and nucleic acids, 

PT useful in preparing a medicament for treating or detecting a 

PT proliferative disorder, e.g. breast, lung, colorectal, ovarian or 

PT prostate cancer or tumor. 

XX 

PS Claim 12; SEQ ID NO 5424; 7273pp; English. 
XX 

CC The invention relates to human tumour-associated antigenic target (TAT) 

CC polypeptides, and their related nucleic acids. The TAT polypeptides are 

CC overexpressed in cancer tissues compared to normal tissues, and may thus 

CC serve as effective targets for the diagnosis and treatment of cancer in 

CC mammals. The invention also relates to nucleic acid and polypeptide 

CC sequences at least 80% identical to the TAT nucleic acids and 

CC polypeptides; expression vectors and host cells comprising a TAT nucleic 

CC acid; an antibody specific for a TAT polypeptide; a peptide or organic 

CC molecule which binds to a TAT polypeptide; fusion proteins comprising a 

CC TAT polypeptide; and methods and compositions for the treatment or 

CC diagnosis of cancer in mammals. TAT polypeptides, nucleic acids, 

CC antibodies, antagonists, binding molecules and compositions are useful 

CC for diagnosing or treating a cell proliferative disorder associated with 

CC increased TAT expression, particularly cancers such as breast cancer, 

CC colorectal cancer, lung cancer, ovarian cancer, liver cancer, bladder 

CC cancer, pancreatic cancer, cervical cancer, cancers of the central 

CC nervous system, melanoma and leukaemia. TAT nucleic acids may further be 

CC used as hybridisation probes, in chromosome and gene mapping, in 

CC chromosome identification and in gene therapy. The present sequence 



CC represents a TAT polypeptide of the invention 
XX 

SQ Sequence 531 AA; 

Query Match 100.0%; Score 2764; DB 8; Length 531; 

Best Local Similarity 100.0%; Pred. No. 1.5e-196; 

Matches 531; Conservative 0; Mismatches 0; Indels 0; Gaps 0; 



Qy      1 MAPTIQTQAQREDGHRPNSHRTLPERSGWCRVKYCNSLPDIPFDPKFITYPFDQNRFVQ 60
          |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db      1 MAPTIQTQAQREDGHRPNSHRTLPERSGWCRVKYCNSLPDIPFDPKFITYPFDQNRFVQ 60

Qy     61 YKATSLEKQHKHDLLTEPDLGVTIDLINPDTYRIDPNVLLDPADEKLLEEEIQAPTSSKR 120
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db     61 YKATSLEKQHKHDLLTEPDLGVTIDLINPDTYRIDPNVLLDPADEKLLEEEIQAPTSSKR 120

Qy    121 SQQHAKWPWMRKTEYISTEFNRYGISNEKPEVKIGVSVKQQFTEEEIYKDRDSQITAIE 180
          |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    121 SQQHAKWPWMRKTEYISTEFNRYGISNEKPEVKIGVSVKQQFTEEEIYKDRDSQITAIE 180

Qy    181 KTFEDAQKSISQHYSKPRVTPVEVMPVFPDFKMWINPCAQVIFDSDPAPKDTSGAAALEM 240
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    181 KTFEDAQKSISQHYSKPRVTPVEVMPVFPDFKMWINPCAQVIFDSDPAPKDTSGAAALEM 240

Qy    241 MSQAMIRGMMDEEGNQFVAYFLPVEETLKKRKRDQEEEMDYAPDDVYDYKIAREYNWNVK 300
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    241 MSQAMIRGMMDEEGNQFVAYFLPVEETLKKRKRDQEEEMDYAPDDVYDYKIAREYNWNVK 300

Qy    301 NKASKGYEENYFFIFREGDGVYYNELETRVRLSKRRAKAGVQSGTNALLWKHRDMNEKE 360
          |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    301 NKASKGYEENYFFIFREGDGVYYNELETRVRLSKRRAKAGVQSGTNALLWKHRDMNEKE 360

Qy    361 LEAQEARKAQLENHEPEEEEEEEMETEEKEAGGSDEEQEKGSSSEKEGSEDEHSGSESER 420
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    361 LEAQEARKAQLENHEPEEEEEEEMETEEKEAGGSDEEQEKGSSSEKEGSEDEHSGSESER 420

Qy    421 EEGDRDEASDKSGSGEDESSEDEARAARDKEEIFGSDADSEDDADSDDEDRGQAQGGSDN 480
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    421 EEGDRDEASDKSGSGEDESSEDEARAARDKEEIFGSDADSEDDADSDDEDRGQAQGGSDN 480

Qy    481 DSDSGSNGGGQRSRSHSRSASPFPSGSEHSAQEDGSEAAASDSSEADSDSD 531
          |||||||||||||||||||||||||||||||||||||||||||||||||||
Db    481 DSDSGSNGGGQRSRSHSRSASPFPSGSEHSAQEDGSEAAASDSSEADSDSD 531





RESULT 5 
AAB93517 

ID AAB93517 standard; protein; 531 AA. 
XX 

AC AAB93517; 
XX 

DT 26-JUN-2001 (first entry) 
XX 

DE Human protein sequence SEQ ID NO: 12853. 
XX 

KW Human; primer; detection; diagnosis; antisense therapy; gene therapy. 
XX 



OS Homo sapiens. 
XX 

PN EP1074617-A2. 
XX 

PD 07-FEB-2001. 
XX 

PF 28-JUL-2000; 2000EP-00116126 . 
XX 

PR 29-JUL-1999; 99 JP-00248036 . 

PR 27-AUG-1999; 99 JP-00300253 . 

PR 11-JAN-2000; 2000 JP-00118776 .

PR 02-MAY-2000; 2000 JP-00183767 . 

PR 09-JUN-2000; 2000 JP-00241899 . 
XX 

PA (HELI-) HELIX RES INST. 
XX 

PI Ota T, Isogai T, Nishikawa T, Hayashi K, Saito K, Yamamoto J;

PI Ishii S, Sugiyama T, Wakamatsu A, Nagai K, Otsuki T; 

XX 

DR WPI; 2001-318749/34. 
XX 

PT Primer sets for synthesizing polynucleotides, particularly the 5602 full- 

PT length cDNAs defined in the specification, and for the detection and/or 

PT diagnosis of the abnormality of the proteins encoded by the full-length 

PT cDNAs.
XX 

PS Claim 8; SEQ ID NO 12853; 2537pp + Sequence Listing; English. 
XX 

CC The present invention describes primer sets for synthesising 5602 full- 

CC length cDNAs defined in the specification. Where a primer set comprises: 

CC (a) an oligo-dT primer and an oligonucleotide complementary to the 

CC complementary strand of a polynucleotide which comprises one of the 5602 

CC nucleotide sequences defined in the specification, where the 

CC oligonucleotide comprises at least 15 nucleotides; or (b) a combination 

CC of an oligonucleotide comprising a sequence complementary to the 

CC complementary strand of a polynucleotide which comprises a 5'-end

CC sequence and an oligonucleotide comprising a sequence complementary to a

CC polynucleotide which comprises a 3'-end sequence, where the

CC oligonucleotide comprises at least 15 nucleotides and the combination of

CC the 5'-end sequence/3'-end sequence is selected from those defined in the

CC specification. The primer sets can be used in antisense therapy and in

CC gene therapy. The primers are useful for synthesising polynucleotides,

CC particularly full-length cDNAs. The primers are also useful for the

CC detection and/or diagnosis of the abnormality of the proteins encoded by

CC the full-length cDNAs. The primers allow obtaining of the full-length

CC cDNAs easily without any specialised methods. AAH03166 to AAH13628 and 

CC AAH13633 to AAH18742 represent human cDNA sequences; AAB92446 to AAB95893 

CC represent human amino acid sequences; and AAH13629 to AAH13632 represent 

CC oligonucleotides, all of which are used in the exemplification of the 

CC present invention 

XX 

SQ Sequence 531 AA; 



Query Match 99.3%; Score 2744; DB 4; Length 531; 

Best Local Similarity 99.4%; Pred. No. 4.5e-195; 

Matches 528; Conservative 0; Mismatches 3; Indels 0; Gaps 0; 



Qy      1 MAPTIQTQAQREDGHRPNSHRTLPERSGWCRVKYCNSLPDIPFDPKFITYPFDQNRFVQ 60
          |||||||||||||||||||||||||||||||||||||||||||||||||||||||| ||
Db      1 MAPTIQTQAQREDGHRPNSHRTLPERSGWCRVKYCNSLPDIPFDPKFITYPFDQNRIVQ 60

Qy     61 YKATSLEKQHKHDLLTEPDLGVTIDLINPDTYRIDPNVLLDPADEKLLEEEIQAPTSSKR 120
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db     61 YKATSLEKQHKHDLLTEPDLGVTIDLINPDTYRIDPNVLLDPADEKLLEEEIQAPTSSKR 120

Qy    121 SQQHAKWPWMRKTEYISTEFNRYGISNEKPEVKIGVSVKQQFTEEEIYKDRDSQITAIE 180
          |||||||||||||||||||||||||||||| ||||||||||||||||||||||||||||
Db    121 SQQHAKWPWMRKTEYISTEFNRYGISNEKPGVKIGVSVKQQFTEEEIYKDRDSQITAIE 180

Qy    181 KTFEDAQKSISQHYSKPRVTPVEVMPVFPDFKMWINPCAQVIFDSDPAPKDTSGAAALEM 240
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    181 KTFEDAQKSISQHYSKPRVTPVEVMPVFPDFKMWINPCAQVIFDSDPAPKDTSGAAALEM 240

Qy    241 MSQAMIRGMMDEEGNQFVAYFLPVEETLKKRKRDQEEEMDYAPDDVYDYKIAREYNWNVK 300
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    241 MSQAMIRGMMDEEGNQFVAYFLPVEETLKKRKRDQEEEMDYAPDDVYDYKIAREYNWNVK 300

Qy    301 NKASKGYEENYFFIFREGDGVYYNELETRVRLSKRRAKAGVQSGTNALLWKHRDMNEKE 360
          |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    301 NKASKGYEENYFFIFREGDGVYYNELETRVRLSKRRAKAGVQSGTNALLWKHRDMNEKE 360

Qy    361 LEAQEARKAQLENHEPEEEEEEEMETEEKEAGGSDEEQEKGSSSEKEGSEDEHSGSESER 420
          ||||||||||||||||| ||||||||||||||||||||||||||||||||||||||||||
Db    361 LEAQEARKAQLENHEPEGEEEEEMETEEKEAGGSDEEQEKGSSSEKEGSEDEHSGSESER 420

Qy    421 EEGDRDEASDKSGSGEDESSEDEARAARDKEEIFGSDADSEDDADSDDEDRGQAQGGSDN 480
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    421 EEGDRDEASDKSGSGEDESSEDEARAARDKEEIFGSDADSEDDADSDDEDRGQAQGGSDN 480

Qy    481 DSDSGSNGGGQRSRSHSRSASPFPSGSEHSAQEDGSEAAASDSSEADSDSD 531
          |||||||||||||||||||||||||||||||||||||||||||||||||||
Db    481 DSDSGSNGGGQRSRSHSRSASPFPSGSEHSAQEDGSEAAASDSSEADSDSD 531





RESULT 6
ABG19682

ID ABG19682 standard; protein; 553 AA.
XX

AC ABG19682;
XX

DT 18-FEB-2002 (first entry)
XX

DE Novel human diagnostic protein #19673.
XX

KW Human; chromosome mapping; gene mapping; gene therapy; forensic;

KW food supplement; medical imaging; diagnostic; genetic disorder.
XX

OS Homo sapiens.
XX

PN WO200175067-A2.
XX

PD 11-OCT-2001.
XX

PF 30-MAR-2001; 2001WO-US008631.
XX





PR 31-MAR-2000; 2000US-00540217 . 

PR 23-AUG-2000; 2000US-00649167 . 
XX 

PA (HYSE-) HYSEQ INC. 
XX 

PI Drmanac RT, Liu C, Tang YT; 
XX 

DR WPI; 2001-639362/73. 

DR N-PSDB; AAS83869. 
XX 

PT New isolated polynucleotide and encoded polypeptides, useful in

PT diagnostics, forensics, gene mapping, identification of mutations

PT responsible for genetic disorders or other traits and to assess 

PT biodiversity. 
XX 

PS Claim 20; SEQ ID NO 50041; 103pp; English. 
XX 

CC The invention relates to isolated polynucleotide (I) and polypeptide (II) 

CC sequences. (I) is useful as hybridisation probes, polymerase chain 

CC reaction (PCR) primers, oligomers, and for chromosome and gene mapping, 

CC and in recombinant production of (II). The polynucleotides are also used

CC in diagnostics as expressed sequence tags for identifying expressed 

CC genes. (I) is useful in gene therapy techniques to restore normal 

CC activity of (II) or to treat disease states involving (II). (II) is 

CC useful for generating antibodies against it, detecting or quantitating a 

CC polypeptide in tissue, as molecular weight markers and as a food 

CC supplement. (II) and its binding partners are useful in medical imaging 

CC of sites expressing (II). (I) and (II) are useful for treating disorders 

CC involving aberrant protein expression or biological activity. The 

CC polypeptide and polynucleotide sequences have applications in 

CC diagnostics, forensics, gene mapping, identification of mutations 

CC responsible for genetic disorders or other traits to assess biodiversity 

CC and to produce other types of data and products dependent on DNA and 

CC amino acid sequences. ABG00010-ABG30377 represent novel human diagnostic 

CC amino acid sequences of the invention. Note: The sequence data for this 

CC patent did not appear in the printed specification, but was obtained in 

CC electronic format directly from WIPO at 

CC ftp.wipo.int/pub/published_pct_sequences

XX 

SQ Sequence 553 AA; 

Query Match 96.2%; Score 2658.5; DB 4; Length 553; 
Best Local Similarity 95.0%; Pred. No. 1.1e-188;

Matches 515; Conservative 6; Mismatches 10; Indels 11; Gaps 1;

Qy 1 MAPTIQTQAQREDGHRPNSHRTLPERSGWCRVKYCNSLPDIPFDPKFITYPFDQNRFVQ 60 

I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I M I I I I I I 

Db 12 MAPTIQTQAQREDGHRPNSHRTLPERSGWCRVKYCNSLPDIPFDPKFITYPFDQNRFVQ 71 

Qy 61 YKATSLEKQHKHDLLTEPDLGVTIDLINPDTYRIDPNVLLDPADEKLLEEEIQAPTSSKR 120 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 72 YKATSLEKQHKHDLLTEPDLGVTIDLINPDTYRIDPNVLLDPADEKLLEEEIQAPTSSKR 131 

Qy 121 SQQHAKWPWMRKTEYISTEFNRYGISNEKPEVKIGVSVKQQFTEEEIYKDRDSQITAIE 180 

I I I I I I I I I I I I I I I I I I I I I I I I I : I I I I I I I II I II I I I I I I I I I I I I I I I I I 

Db 132 SQQHAKWPWMRKTEYISTEFNRYCIFHEKPEVKKWGSVKQQFTEEEIYKDRDSQITAIE 191 



Qy    181 KTFEDAQKS ISQHYSKPRVTPVEVMPVFPDFKMWINPCAQVIFDSDPAP 229
          1 1 1 1 1 1 1 1 1 1 1 1 i 1 1 1 1 1 1 1 II 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
Db    192 KTF EDAQKSVIEGLQjWbhAAKloyrli oKrHv 1 rVbVJYlt'VE rur KJYLWIN fLAy VI r UoJJrAr 251

Qy    230 KDTSGAAALEMMSQAMIRGMMDEEGNQFVAYFLPVEETLKKRKRDQEEEMDYAPDDVYDY 289
          I 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
Db    252 KDTSGAAAJLEMMSQAMIRGMMDb&GNyr VAit UrvtittL ijKJ\Kx\KUyb^EJYlUi/Vh'lJUV iUx 311

Qy    290 KIAREYNWNVKNKASKGYEENYFFIFREGDGVYYNELETRVRLSKRRAKAGVQSGTNALL 349
          1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 M 1 1 1 1 1 1 1 1 1
Db    312 KIAREYNWNvTCNKASKGYEENYFFI FREGDGVYYNELETRVKLSKRRAi^AGVQbGl NALL 371

Qy    350 WKHRDMNEKELEAQEARKAQLENHEPEEEEEEEMETEEKEAGGSDEEQEKGSSSEKEGS 409
          1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 II 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
Db    372 VVT\HRDMNEKELEAQETRKAQLENHEPEEEEEEEMETEEKEAGGSDEE^ 431

Qy    410 EDEHSGSESEREEGDRDEASDKSGSGEDESSEDEARAARDKEEIFGSDADSEDDADSDDE 469
          |||||||||||||||| |||||||||:|:||:  ||||||||||||||||||||||||||
Db    432 EDEHSGSESEREEGDRHEASDKSGSGQDDSSDYXARAARDKEEIFGSDADSEDDADSDDE 491

Qy    470 DRGQAQGGSDNDSDSGSNGGGQRSRSHSRSASPFPSGSEHSAQEDGSEAAASDSSEADSD 529
          |||||||||||||||| ||||||:|||||||||||||||||||:||||||||||||||||
Db    492 DRGQAQGGSDNDSDSGRNGGGQRTRSHSRSASPFPSGSEHSAQENGSEAAASDSSEADSD 551

Qy    530 SD 531
          ||
Db    552 SD 553





RESULT 7 
AAB56316 

ID AAB56316 standard; protein; 473 AA. 
XX 

AC AAB56316; 
XX 

DT 13-MAR-2001 (first entry) 
XX 

DE Human secreted protein sequence encoded by gene 106 SEQ ID NO: 410. 
XX 

KW Human; secreted protein; diagnosis; immunosuppressive; antiarthritic; 

KW antirheumatic; antiproliferative; cytostatic; cardiant; vasotropic; 

KW cerebroprotective; nootropic; neuroprotective; antibacterial; virucide; 

KW fungicide; ophthalmological; gene therapy; pathological condition; 

KW autoimmune disease; rheumatoid arthritis; hyperproliferative disorder; 

KW neoplasm; cardiovascular disorder; cardiac arrest; cerebral ischaemia; 

KW cerebrovascular disorder; angiogenesis; nervous system disorder;

KW Alzheimer's disease; infection; ocular disorder; corneal infection;

KW wound healing; skin aging; food additive; preservative. 

XX 

OS Homo sapiens. 
XX 

PN WO200070042-A1. 
XX 

PD 23-NOV-2000. 
XX 

PF 11-MAY-2000; 2000WO-US012788.



PR 13-MAY-1999; 99US-0134068P . 
XX 

PA (HUMA-) HUMAN GENOME SCI INC. 
XX 

PI Rosen CA, Ruben SM, Moore PA, Young PE, Komatsoulis GA, Birse CE; 

PI Duan RD, Florence KA, Soppet DR; 

XX 

DR WPI; 2000-679828/66. 
XX 

PT Isolated nucleic acid molecule encoding a human secreted protein is used 

PT in preventing, treating or ameliorating a medical condition. 

XX 

PS Disclosure; Page 1041-1042; 1065pp; English. 
XX 

CC The polynucleotide sequences given in AAC99818 to AAC99977 encode the 

CC human secreted proteins given in AAB56077 to AAB56362. Human secreted 

CC proteins have activities based on the tissues and cells the genes are 

CC expressed in. Examples of activities include: immunosuppressive; 

CC antiarthritic; antirheumatic; antiproliferative; cytostatic; cardiant; 

CC vasotropic; cerebroprotective; nootropic; neuroprotective; antibacterial; 

CC virucide; fungicide; and ophthalmological. The human secreted

CC polynucleotides and proteins can be used to prevent, treat or ameliorate 

CC a medical condition in e.g. humans, mice, rabbits, goats, horses, cats, 

CC dogs, chickens or sheep. They are also used in diagnosing a pathological 

CC condition or susceptibility to a pathological condition. Disorders which 

CC are diagnosed or treated include autoimmune diseases e.g. rheumatoid 

CC arthritis, hyperproliferative disorders e.g. neoplasms of the breast or

CC liver, cardiovascular disorders e.g. cardiac arrest, cerebrovascular

CC disorders e.g. cerebral ischaemia, angiogenesis, nervous system disorders

CC e.g. Alzheimer's disease, infections caused by bacteria, viruses and

CC fungi and ocular disorders e.g. corneal infection. The proteins can also 

CC be used to aid wound healing and epithelial cell proliferation, to 

CC prevent skin aging due to sunburn, to maintain organs before 

CC transplantation, for supporting cell culture of primary tissues, to 

CC regenerate tissues and in chemotaxis. The proteins can also be used as a 

CC food additive or preservative to increase or decrease storage 

CC capabilities. AAC99809 to AAC99817 and AAB56076 represent sequences used 

CC in the exemplification of the present invention 

XX 

SQ Sequence 473 AA; 

Query Match 89.1%; Score 2464; DB 3; Length 473; 

Best Local Similarity 99.8%; Pred. No. 2.4e-174; 

Matches 472; Conservative 0; Mismatches 1; Indels 0; Gaps 0;



Qy      1 MAPTIQTQAQREDGHRPNSHRTLPERSGWCRVKYCNSLPDIPFDPKFITYPFDQNRFVQ 60
          |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db      1 MAPTIQTQAQREDGHRPNSHRTLPERSGWCRVKYCNSLPDIPFDPKFITYPFDQNRFVQ 60

Qy     61 YKATSLEKQHKHDLLTEPDLGVTIDLINPDTYRIDPNVLLDPADEKLLEEEIQAPTSSKR 120
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db     61 YKATSLEKQHKHDLLTEPDLGVTIDLINPDTYRIDPNVLLDPADEKLLEEEIQAPTSSKR 120

Qy    121 SQQHAKWPWMRKTEYISTEFNRYGISNEKPEVKIGVSVKQQFTEEEIYKDRDSQITAIE 180
          |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    121 SQQHAKWPWMRKTEYISTEFNRYGISNEKPEVKIGVSVKQQFTEEEIYKDRDSQITAIE 180

Qy    181 KTFEDAQKSISQHYSKPRVTPVEVMPVFPDFKMWINPCAQVIFDSDPAPKDTSGAAALEM 240
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    181 KTFEDAQKSISQHYSKPRVTPVEVMPVFPDFKMWINPCAQVIFDSDPAPKDTSGAAALEM 240

Qy    241 MSQAMIRGMMDEEGNQFVAYFLPVEETLKKRKRDQEEEMDYAPDDVYDYKIAREYNWNVK 300
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    241 MSQAMIRGMMDEEGNQFVAYFLPVEETLKKRKRDQEEEMDYAPDDVYDYKIAREYNWNVK 300

Qy    301 NKASKGYEENYFFIFREGDGVYYNELETRVRLSKRRAKAGVQSGTNALLWKHRDMNEKE 360
          |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    301 NKASKGYEENYFFIFREGDGVYYNELETRVRLSKRRAKAGVQSGTNALLWKHRDMNEKE 360

Qy    361 LEAQEARKAQLENHEPEEEEEEEMETEEKEAGGSDEEQEKGSSSEKEGSEDEHSGSESER 420
          |||||||||||||||||||||||||||||||||||||||||||| |||||||||||||||
Db    361 LEAQEARKAQLENHEPEEEEEEEMETEEKEAGGSDEEQEKGSSSXKEGSEDEHSGSESER 420

Qy    421 EEGDRDEASDKSGSGEDESSEDEARAARDKEEIFGSDADSEDDADSDDEDRGQ 473
          |||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    421 EEGDRDEASDKSGSGEDESSEDEARAARDKEEIFGSDADSEDDADSDDEDRGQ 473



RESULT 8
ABB59163

ID ABB59163 standard; protein; 538 AA.
XX

AC ABB59163;
XX

DT 26-MAR-2002 (first entry)
XX

DE Drosophila melanogaster polypeptide SEQ ID NO 4281.
XX

KW Drosophila; developmental biology; cell signalling; insecticide;

KW pharmaceutical.
XX

OS Drosophila melanogaster.
XX

PN WO200171042-A2.
XX

PD 27-SEP-2001.
XX

PF 23-MAR-2001; 2001WO-US009231.
XX

PR 23-MAR-2000; 2000US-0191637P.

PR 11-JUL-2000; 2000US-00614150.
XX

PA (PEKE ) PE CORP NY.
XX

PI Venter JC, Adams M, Li PWD, Myers EW;
XX

DR WPI; 2001-656860/75.

DR N-PSDB; ABL03266.
XX

PT New isolated nucleic acid detection reagent for detecting 1000 or more

PT genes from Drosophila and for elucidating cell signaling and cell-cell

PT interactions.
XX





PS Disclosure; SEQ ID NO 4281; 21pp + Sequence Listing; English. 
XX 

CC The invention relates to an isolated nucleic acid detection reagent 

CC capable of detecting 1000 or more genes from Drosophila. The invention is 

CC useful in developmental biology and in elucidating cell signalling and 

CC cell-cell interactions in higher eukaryotes for the development of 

CC insecticides, therapeutics and pharmaceutical drugs. The invention

CC discloses genomic DNA sequences (ABL16176-ABL30511), expressed DNA

CC sequences (ABL01840-ABL16175) and the encoded proteins (ABB57737-

CC ABB72072). The sequence data for this patent did not form part of the

CC printed specification, but was obtained in electronic format directly

CC from WIPO at ftp.wipo.int/pub/published_pct_sequences

XX 

SQ Sequence 538 AA; 

Query Match 45.0%; Score 1244.5; DB 4; Length 538; 

Best Local Similarity 50.0%; Pred. No. 1.1e-83;

Matches 271; Conservative 66; Mismatches 172; Indels 33; Gaps 11; 

Qy 1 MAPTIQTQAQREDGHRPNSHRTLPERSGWCRVKYCNSLPDIPFDPKFITYPFDQNRFVQ 60 

I I I I I : I : I :: I I I I I I : I I I I I I I I I : I I I I : I I I I 

Db 1 MPPTINNSAVNSAAEK-RPQRQTERKSEIICRVKYGNNLPDIPFDLKFLQYPFDSHRFVQ 59 

Qy 61 YKATSLEKQHKHDLLTEPDLGVTIDLINPDTYRIDPNVLLDPADEKLLEEEIQAPTSSKR 120 

I MM: I : I : I II I I I I I : I I I I : I : I I I I I I I I I I I I I I I I I I 
Db 60 YNPTSLERNFKYDVLTEHDLGVTVDLINRELYQADSMTLLDPADEKLLEEETLTPTDSVR 119 



Qy 121 SQQHAKWPWMRKTEYISTEFNRYGISN-EKPEVKIGVSVKQQFTEEEIYKDRDSQITAI 179 

I : I I : : I I : I I : I I I I I I I : II I I : I : I I : I I : I I I :: I I I I 

Db 120 SRQHSRTVSWLRKSEYISTEQTRFQPQNLENIEAKVGYNVKKSLREETLYLDREAQIKAI 179 

Qy 180 EKTFEDAQKSISQHYSKPRVTPVEVMPVFPDFKMWINPCAQVIFDSDPAPKDTSGAAALE 239 

Mill: I : : I II I I I II I I : I : I II I I I I I I I I I I I II II : III 

Db 180 EKTFSDTKSEITKHYSKPNWPVEVLPIFPDFTNWKFPCAQVI FDSDPAPAGKNVPAQLE 239 



Qy 240 MMSQAMIRGMMDEEGNQFVAYFLPVEETLKKRKRDQEEEMDYAPDDVYDYKIAREYNWNV 299 

I I I II I I I : I I I I I I II I I I I I : I I : I I : I I :: I : I M M I M M I 

Db 240 EMSQAMIRGVMDESGEQFVAYFLPTEQTLEKRRTDFINGELYKEEEEYEYKIAREYNWNV 299 

Qy 300 KNKASKGYEENYFFIFREGDGVYYNELETRVRLSKRRAKAGVQSGTNALLVVKHRDMNEK 359

I I II I I I I I I II I : I : II : I II I I I I I I I I : I I I I I I I I I I I I I : : 
Db 300 KTKASKGYEENYFFVMRQ-DGIYYNELETRVKLNKRRWVG-QQPNNTKLVVKHRPLDSM 357 

Qy 360 ELEAQEARKAQLENHEPEEE EEEEM ETEE KEAGGSD 395 

I 11:111 III II I : I III: : I I : I 

Db 358 EH RMQ RYRERQ LEVP GEE E E I VE EVREE EQMQ 1 1 GET EKT S EDAAVGAQAAS GAD S P AQV 417 

Qy 396 — EEQEKGSSSEKEGSEDEHSGSESEREEGDRDEASDKSGSGEDESSEDEARAARDKEEI 453 

: I : I : I I I II I I : : : I II I : I ' ' 

Db 418 ARDRQSRSRSRTRSGS-SSGSGSGSGSRASSRSKSGSRSGSGSRSRTNSPAGSQKSGSR- 475 

Qy 454 FGSDADSEDDADSDDEDRGQAQGGSDNDSDSGS-NGGGQRSRSHSRSASPFPSGSEHSAQ 512 

1:1 : I I : : : I : I I I I : I I I I I I I I I I II I 
Db 476 SRSVSRSRSRSKSGSRSRSRSRSKSGSRSRSGSRSGSGSRSPSRSRSGSPSGSGSSSGSA 535 



Qy 



513 ED 514 
I 



Db 



536 SD 537 



RESULT 9 
ABG19681 

ID ABG19681 standard; protein; 133 AA. 
XX 

AC ABG19681; 
XX 

DT 18-FEB-2002 (first entry) 
XX 

DE Novel human diagnostic protein #19672. 
XX 

KW Human; chromosome mapping; gene mapping; gene therapy; forensic; 

KW food supplement; medical imaging; diagnostic; genetic disorder. 
XX 

OS Homo sapiens. 
XX 

PN WO200175067-A2. 
XX 

PD 11-OCT-2001.
XX 

PF 30-MAR-2001; 2001WO-US008631 . 
XX 

PR 31-MAR-2000; 2000US-00540217 . 

PR 23-AUG-2000; 2000US-00649167 . 
XX 

PA (HYSE-) HYSEQ INC. 
XX 

PI Drmanac RT, Liu C, Tang YT; 
XX 

DR WPI; 2001-639362/73. 

DR N-PSDB; AAS83868. 
XX 

PT New isolated polynucleotide and encoded polypeptides, useful in 

PT diagnostics, forensics, gene mapping, identification of mutations 

PT responsible for genetic disorders or other traits and to assess 

PT biodiversity. 
XX 

PS Claim 20; SEQ ID NO 50040; 103pp; English. 
XX 

CC The invention relates to isolated polynucleotide (I) and polypeptide (II) 

CC sequences. (I) is useful as hybridisation probes, polymerase chain 

CC reaction (PCR) primers, oligomers, and for chromosome and gene mapping, 

CC and in recombinant production of (II). The polynucleotides are also used

CC in diagnostics as expressed sequence tags for identifying expressed 

CC genes. (I) is useful in gene therapy techniques to restore normal 

CC activity of (II) or to treat disease states involving (II). (II) is 

CC useful for generating antibodies against it, detecting or quantitating a 

CC polypeptide in tissue, as molecular weight markers and as a food 

CC supplement. (II) and its binding partners are useful in medical imaging 

CC of sites expressing (II). (I) and (II) are useful for treating disorders

CC involving aberrant protein expression or biological activity. The 

CC polypeptide and polynucleotide sequences have applications in 

CC diagnostics, forensics, gene mapping, identification of mutations 

CC responsible for genetic disorders or other traits to assess biodiversity 

CC and to produce other types of data and products dependent on DNA and 



CC amino acid sequences. ABG00010-ABG30377 represent novel human diagnostic 

CC amino acid sequences of the invention. Note: The sequence data for this 

CC patent did not appear in the printed specification, but was obtained in 

CC electronic format directly from WIPO at 

CC ftp.wipo.int/pub/published_pct_sequences
XX 

SQ Sequence 133 AA; 

Query Match 22.5%; Score 622; DB 4; Length 133; 

Best Local Similarity 64.4%; Pred. No. 2.9e-38; 

Matches 130; Conservative 0; Mismatches 2; Indels 70; Gaps 1; 

Qy 273 RDQEEEMDYAPDDVYDYKIAREYNWNVKNKASKGYEENYFFIFREGDGVYYNELETRVRL 332

I I I I I I I I I I I 1 I I I I I I I I I I I I II I I I I II I I I 

Db 1 RDQEEEMDYAPDDVYDYKIAREYNWNVKNKASKGY 35 

Qy 333 SKRRAKAGVQSGTNALLWKHRDMNEKELEAQEARKAQLENHEPEEEEEEEMETEEKEAG 392 

I I I I I I I I I I I I I I I 

Db 36 EEEEEEMETEEKEAG 50 

Qy 393 GSDEEQEKGSSSEKEGSEDEHSGSESEREEGDRDEASDKSGSGEDESSEDEARAARDKEE 452 

II I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I 

Db 51 GSYEEQEKGSSSEKEGSEDEHSGSESEREEGDRDEASDKSGSGEDESSEDEARAARDKEE 110 

Qy 453 IFGSDADSEDDADSDDEDRGQA 474 

I I I I I I I I I I I II I I I I I I I I 
Db 111 IFGSDADSEDDADSYDEDRGQA 132 



RESULT 10 
AAG03326 

ID AAG03326 standard; protein; 115 AA. 
XX 

AC AAG03326; 
XX 

DT 06-OCT-2000 (first entry) 
XX 

DE Human secreted protein, SEQ ID NO: 7407. 
XX 

KW Human; 5' EST; expressed sequence tag; secreted protein; cDNA isolation;

KW gene therapy; chromosome mapping. 

XX 

OS Homo sapiens. 
XX 

PN EP1033401-A2. 
XX 

PD 06-SEP-2000. 
XX 

PF 21-FEB-2000; 2000EP-00200610 . 
XX 

PR 26-FEB-1999; 99US-0122487P . 
XX 

PA (GEST ) GENSET. 
XX 

PI Dumas Milne Edwards J, Duclert A, Giordano J; 
XX 

DR WPI; 2000-500381/45. 



DR N-PSDB; AAC03332. 
XX 

PT New nucleic acid that is a 5' expressed sequence tag (5' EST) for 

PT obtaining cDNAs and genomic DNAs that correspond to 5' ESTs and for

PT diagnostic, forensic, gene therapy and chromosome mapping procedures. 
XX 

PS Claim 13; SEQ ID NO 7407; 71pp + Sequence Listing; English. 
XX 

CC The present sequence is a polypeptide encoded by one of a large number of 

CC 5' ESTs derived from mRNAs encoding secreted proteins. The 5' ESTs were

CC prepared from total human RNAs or polyA+ RNAs derived from 30 different

CC tissues. EST sequences usually correspond mainly to the 3' untranslated

CC region (UTR) of the mRNA because they are often obtained from oligo-dT 

CC primed cDNA libraries. Such ESTs are not well suited for isolating cDNA 

CC sequences derived from the 5' ends of mRNAs and even in those cases where 

CC longer cDNA sequences have been obtained, the full 5' UTR is rarely 

CC included. 5' ESTs are derived from mRNAs with intact 5' ends and can

CC therefore be used to obtain full length cDNAs and genomic DNAs. 5' ESTs

CC are also used in diagnostic, forensic, gene therapy and chromosome 

CC mapping procedures. They are used to obtain upstream regulatory sequences 

CC and to design expression and secretion vectors 
XX 

SQ Sequence 115 AA; 

Query Match 21.5%; Score 595; DB 3; Length 115; 

Best Local Similarity 96.5%; Pred. No. 2.5e-36; 

Matches 111; Conservative 0; Mismatches 4; Indels 0; Gaps 0; 

Qy 1 MAPTIQTQAQREDGHRPNSHRTLPERSGWCRVKYCNSLPDIPFDPKFITYPFDQNRFVQ 60 

I II I I I II I I I I I I I I I I I I I I I I I I I I I II II I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 1 MAPTIQTQAQREDGHRPNSHRTLPXXSGWCRVKYCNSLPDIPFDPKFITYPFDQNRFVQ 60 

Qy 61 YKATSLEKQHKHDLLTEPDLGVTIDLINPDTYRIDPNVLLDPADEKLLEEEIQAP 115 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I III I 
Db 61 YKATSLEKQHKHDLLTEPDLGVTIDLINPDTYRIDPNVLLDPADEKLLEXEIQXP 115 
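
The summary figures printed for each hit appear to be related by simple arithmetic. The sketch below is not taken from the search program's documentation; it assumes, based only on the numbers in this report, that Query Match is the hit score divided by the perfect score (2764), and that Best Local Similarity is the number of matches divided by the number of alignment columns (matches + conservative + mismatches + indels). The class and function names are hypothetical.

```python
from dataclasses import dataclass

PERFECT_SCORE = 2764  # "Perfect score" from the report header


@dataclass
class HitStats:
    """Summary line values for one RESULT entry."""
    score: float
    matches: int
    conservative: int
    mismatches: int
    indels: int


def query_match_pct(stats: HitStats, perfect: float = PERFECT_SCORE) -> float:
    # Assumed relation: score as a percentage of the self-alignment score.
    return round(100.0 * stats.score / perfect, 1)


def best_local_similarity_pct(stats: HitStats) -> float:
    # Assumed relation: identical columns over all alignment columns,
    # where every indel position counts as one column.
    columns = (stats.matches + stats.conservative
               + stats.mismatches + stats.indels)
    return round(100.0 * stats.matches / columns, 1)


# Values copied from RESULT 9 and RESULT 10 above.
result_9 = HitStats(score=622, matches=130, conservative=0,
                    mismatches=2, indels=70)
result_10 = HitStats(score=595, matches=111, conservative=0,
                     mismatches=4, indels=0)

if __name__ == "__main__":
    for name, hit in (("RESULT 9", result_9), ("RESULT 10", result_10)):
        print(name, query_match_pct(hit), best_local_similarity_pct(hit))
```

Under these assumptions the computed values agree with the printed ones for RESULT 9 (22.5% and 64.4%) and RESULT 10 (21.5% and 96.5%). Note that the 70 indel positions in RESULT 9 lower its local similarity even though only 2 aligned residues differ.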



RESULT 11 
ABG19412 

ID ABG19412 standard; protein; 475 AA. 
XX 

AC ABG19412; 
XX 

DT 13-FEB-2002 (first entry) 
XX 

DE Novel human diagnostic protein #19403. 
XX 

KW Human; chromosome mapping; gene mapping; gene therapy; forensic; 

KW food supplement; medical imaging; diagnostic; genetic disorder. 
XX 

OS Homo sapiens.
XX 

PN WO200175067-A2. 
XX 

PD 11-OCT-2001.
XX 

PF 30-MAR-2001; 2001WO-US008631 . 



XX 

PR 31-MAR-2000; 2000US-00540217 . 

PR 23-AUG-2000; 2000US-00649167 . 
XX 

PA (HYSE-) HYSEQ INC. 
XX 

PI Drmanac RT, Liu C, Tang YT; 
XX 

DR WPI; 2001-639362/73. 

DR N-PSDB; AAS83599. 
XX 

PT New isolated polynucleotide and encoded polypeptides, useful in

PT diagnostics, forensics, gene mapping, identification of mutations 

PT responsible for genetic disorders or other traits and to assess 

PT biodiversity. 
XX 

PS Claim 20; SEQ ID NO 49771; 103pp; English. 
XX 

CC The invention relates to isolated polynucleotide (I) and polypeptide (II) 

CC sequences. (I) is useful as hybridisation probes, polymerase chain 

CC reaction (PCR) primers, oligomers, and for chromosome and gene mapping, 

CC and in recombinant production of (II). The polynucleotides are also used 

CC in diagnostics as expressed sequence tags for identifying expressed 

CC genes. (I) is useful in gene therapy techniques to restore normal 

CC activity of (II) or to treat disease states involving (II). (II) is 

CC useful for generating antibodies against it, detecting or quantitating a 

CC polypeptide in tissue, as molecular weight markers and as a food 

CC supplement. (II) and its binding partners are useful in medical imaging 

CC of sites expressing (II). (I) and (II) are useful for treating disorders

CC involving aberrant protein expression or biological activity. The 

CC polypeptide and polynucleotide sequences have applications in 

CC diagnostics, forensics, gene mapping, identification of mutations 

CC responsible for genetic disorders or other traits to assess biodiversity 

CC and to produce other types of data and products dependent on DNA and 

CC amino acid sequences. ABG00010-ABG30377 represent novel human diagnostic 

CC amino acid sequences of the invention. Note: The sequence data for this 

CC patent did not appear in the printed specification, but was obtained in 

CC electronic format directly from WIPO at 

CC ftp.wipo.int/pub/published_pct_sequences

XX 

SQ Sequence 475 AA; 

Query Match 10.2%; Score 283; DB 4; Length 475; 

Best Local Similarity 87.7%; Pred. No. 2.3e-12; 

Matches 57; Conservative 3; Mismatches 5; Indels 0; Gaps 0; 
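The two percentages in the result header above appear to follow directly from the listed figures: Query Match looks like the score taken as a fraction of the perfect score (2764 for this query), and Best Local Similarity looks like the number of matches divided by the number of alignment columns (matches + conservative + mismatches + indels). The sketch below checks that reading against this result. The function names are hypothetical, and the formulas are inferred from the numbers in this report rather than taken from GenCore documentation.

```python
def query_match(score: float, perfect_score: float) -> float:
    """Score as a percentage of the query's self-alignment (perfect) score."""
    return 100.0 * score / perfect_score


def best_local_similarity(matches: int, conservative: int,
                          mismatches: int, indels: int) -> float:
    """Identical positions as a percentage of all alignment columns."""
    columns = matches + conservative + mismatches + indels
    return 100.0 * matches / columns


# Figures reported for this result: Score 283, perfect score 2764,
# Matches 57, Conservative 3, Mismatches 5, Indels 0.
print(round(query_match(283, 2764), 1))             # 10.2
print(round(best_local_similarity(57, 3, 5, 0), 1))  # 87.7
```

The same arithmetic reproduces the figures of the gapped hits in this listing, for example 112 matches over 112 + 96 + 184 + 114 columns gives 22.1%.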

Qy        328 TRVRLSKRRAKAGVQSGTNALLVVKHRDMNEKELEAQEARKAQLENHEPEEEEEEEMETE 387
              :|||||||||||||||||||||||||||||||||||||||||||||||||||||||:
Db          3 SRVRLSKRRAKAGVQSGTNALLVVKHRDMNEKELEAQEARKAQLENHEPEEEEEEEIRQP 62

Qy        388 EKEAG 392
               |: |
Db         63 RKKLG 67



RESULT 12 
ADS12265 



ID ADS12265 standard; protein; 475 AA. 
XX 

AC ADS12265; 
XX 

DT 16-DEC-2004 (first entry) 
XX 

DE Human therapeutic contig protein - SEQ ID 2502. 
XX 

KW antiinflammatory; neuroprotective; antianaemic; cytostatic; vulnerary; 

KW inflammatory; haematopoiesis; immunity; neurodegenerative; stem cell; 

KW aplastic anaemia; cancer; wound healing; gene therapy. 
XX 

OS Homo sapiens. 
XX 

FH Key Location/Qualifiers 

FT Misc-difference 1..475 

FT /label= Unknown, OTHER 

FT /note= "OTHER = In-frame STOP codon" 

XX 

PN WO2004080148-A2. 
XX 

PD 23-SEP-2004. 
XX 

PF 30-SEP-2003; 2003WO-US030720. 
XX 

PR 02-OCT-2002; 2002US-0416186P. 
XX 

PA (NUVE-) NUVELO INC. 
XX 

PI Tang YT, Asundi V, Ren F, Zhang J, Wehrman T, Wang Z, Ma Y; 

PI Wang D, Chen R, Zhao QA, Wang J, Ghosh M, Xue AJ, Weng G, Zhou P; 

XX 

DR WPI; 2004-668857/65. 

DR N-PSDB; ADS11667. 
XX 

PT New polynucleotide, useful in preparing a composition for diagnosing or 

PT treating inflammatory, neurodegenerative or stem cell disorders, e.g., 

PT aplastic anemia or cancer for promoting wound healing. 
XX 

PS Example 2; SEQ ID NO 2502; 718pp; English. 
XX 

CC The invention relates to a novel isolated polynucleotide and the encoded 

CC polypeptide. The molecules of the invention demonstrate antiinflammatory, 

CC neuroprotective, antianaemic, cytostatic and vulnerary activities and may 

CC be useful in preparing a composition for diagnosing or treating 

CC inflammatory, haematopoietic, immune, neurodegenerative or stem cell 

CC disorders, such as aplastic anaemia or cancer, as well as for promoting 

CC wound healing. The molecules may also be utilised during gene therapy 

CC procedures. The current sequence is that of a human therapeutic contig 

CC protein of the invention. 

XX 

SQ Sequence 475 AA; 

Query Match 10.2%; Score 283; DB 8; Length 475; 

Best Local Similarity 87.7%; Pred. No. 2.3e-12; 

Matches 57; Conservative 3; Mismatches 5; Indels 0; Gaps 0; 



Qy        328 TRVRLSKRRAKAGVQSGTNALLVVKHRDMNEKELEAQEARKAQLENHEPEEEEEEEMETE 387
              :|||||||||||||||||||||||||||||||||||||||||||||||||||||||:
Db          3 SRVRLSKRRAKAGVQSGTNALLVVKHRDMNEKELEAQEARKAQLENHEPEEEEEEEIRQP 62

Qy        388 EKEAG 392
               |: |
Db         63 RKKLG 67



RESULT 13 
ABR53245 

ID ABR53245 standard; protein; 445 AA. 
XX 

AC ABR53245; 
XX 

DT 20-JUN-2003 (first entry) 
XX 

DE Protein sequence #SEQ ID 1355. 
XX 

KW Multiprotein complex; eukaryote; drug target; diagnosis. 
XX 

OS Saccharomyces cerevisiae. 
XX 

PN EP1258494-A1. 
XX 

PD 20-NOV-2002. 
XX 

PF 20-DEC-2001; 2001EP-00130253. 
XX 

PR 15-MAY-2001; 2001EP-00111774. 
XX 

PA (CELL-) CELLZOME AG. 
XX 

PI Bauer A, Gavin A, Grandi P, Krause R, Kruse UD, Kuester BD; 

PI Marzioch M, Schultz JD, Superti-Furga GD; 

XX 

DR WPI; 2003-250078/25. 

DR N-PSDB; ACC61287. 
XX 

PT New isolated protein complexes useful for diagnosing a disease or 

PT disorder, or as a target for an active agent of a pharmaceutical, 

PT preferably a drug target in the treatment or prevention of disease or 

PT disorder. 

XX 

PS Disclosure; SEQ ID NO 1355; 17pp + Sequence Listing; English. 
XX 

CC The invention relates to multiprotein complexes from eukaryotes. Proteins 

CC of the invention and DNA sequences encoding them are given in records 

CC ABR52568-ABR53903 and ACC60610-ACC61944 respectively. The complexes are 

CC obtainable by using a protein as a bait and isolating the set of proteins 

CC which is attached thereto from cells. Such protein complexes may comprise 

CC up to 30 distinct proteins. Protein complexes of the invention are useful 

CC for diagnosing a disease or disorder, or as a target for an active agent 

CC of a pharmaceutical, preferably a drug target in the treatment or 

CC prevention of a disease or disorder. Note: The sequence data for this 

CC patent is not represented in the printed specification, but is based on 

CC sequence information supplied by the European Patent Office. The complete 



CC document is available on CD-ROM 
XX 

SQ Sequence 445 AA; 

Query Match 9.2%; Score 253; DB 6; Length 445; 

Best Local Similarity 22.1%; Pred. No. 3.6e-10; 

Matches 112; Conservative 96; Mismatches 184; Indels 114; Gaps 21; 

Qy         23 LPERSGVVCRVKYCNSLPDIPFDPKFITYP FDQNRFVQ YKAT S LEKQHK 71
              : :: : :|| |||| || : || : ||| ||
Db          1 MSKKQEYIAPIKYQNSLPVPQLPPKLLVYPESPETNADSSQLINSLYIKTNVTNLIQQ-- 58

Qy         72 HDLLTEPDLGVTIDLI NPDTYRIDPNVLLDPADEKLLEEEIQAPTSSKRS 121
              : |||| |||| : | | || || | || : | : :
Db         59 DEDLGMPVDLMKFPGLLNKLDSKLLYGFD-NVKLDKDDRILLRD PRIDRLT 108

Qy        122 QQHAKVVPWMRKTEYISTEFNRYGISNEKPEVKIGVSVKQQFTEEEIYKDRDSQ 175
              : | : : | : | || : | : : : | : : : | | |
Db        109 KTDISKVTFLRRTEYVSNTIAAHDNTSLKRKRRL DDGDSDDENLDV 154

Qy        176 ITAIEKTFEDAQKSISQHYSKPRVTPVEVMPVFPDFKMWINPCAQVIFDSDPAPKDT 232
              || : | || | || | | || : || |||
Db        155 NHIISRVEGTFNKTDK--WQHPVKKGVKMVKKWDLLPD TASMDQVYF ILKF 203

Qy        233 SGAAALEMMSQAMIR-GM MDEEGNQFVAYFLPVEETLKKRKRDQEEEMDYAPDDVYD 288
              || || || : : || : : | : : : : : : : : : || || || : :
Db        204 MGSASLDTKEKKSLNTGIFRPVELEEDEWISMYATDHKDSAILENELEKGMDEMDDDSHE 263

Qy        289 YKIAREYNWNVKNKASKGYEENYFFIFREGDGV-YYNELETRVRLSKRRAKAGVQSG 344
              || | : : : : | || | : | || | | | : : : | : || : :
Db        264 GKIYKFKRIRDYDMKQVAEKPMTE-LAIRLNDKDGIAYYKPLRSKIELRRRRVNDIIKP- 321

Qy        345 TNALLVVKH RDMNEKELEAQEARKAQLEN HEPEEEEEEEMETEEK 389
              |||| | : : || : : : : : | : :| | | || | : |
Db        322 LVKEHDIDQLNVTLRNPSTKEANIRDKLRMKFDPINFATVDEEDDEDEEQPEDVKK 377

Qy        390 EAGGSDEEQEKGSSSEKEGSEDEHSGSESEREEGDRDEASDKSGSGEDESSEDEARAARD 449
              ||| : : : | | || ||| |||| : : | : ||| |||
Db        378 ESEG--DSKTEGSEQEGENEKDEEIKQEKENEQ DEENKQDENRAADT 422

Qy        450 KEEIFGSDADSEDDADSDDEDRGQAQ 475
              | ||| : :::: | :
Db        423 PET SDAVHTEQKPEEEKETLQEE 445



RESULT 14 
ADK63670 

ID ADK63670 standard; protein; 445 AA. 
XX 

AC ADK63670; 
XX 

DT 06-MAY-2004 (first entry) 
XX 

DE Disease treating protein complex-derived protein #818. 
XX 

KW protein complex; drug target; diagnosis. 
XX 



OS Unidentified. 
XX 

PN EP1338608-A2. 
XX 

PD 27-AUG-2003. 
XX 

PF 20-DEC-2002; 2002EP-00102902. 
XX 

PR 20-DEC-2001; 2001EP-00130253. 
XX 

PA (CELL-) CELLZOME AG. 
XX 

PI Bauer A, Gavin A, Superti-Furga G, Kuester B, Schultz J; 

PI Marzioch M, Grandi P, Krause R, Kruse U, Merino A, Bauch A; 

PI Michon A, Leutwein C, Rick J; 

XX 

DR WPI; 2003-638460/61. 

DR N-PSDB; ADK63671. 
XX 

PT New proteins and protein complexes from eukaryotes, useful as targets in 

PT drug screening, or in diagnosing or screening for the presence of a 

PT disease or disorder, or a predisposition for developing a disease or 

PT disorder in a subject. 
XX 

PS Disclosure; SEQ ID NO 1635; 13pp; English. 
XX 

CC The invention relates to novel protein complexes comprising a first and a 

CC second protein, or its derivative, fragment, homologue or variant. The 

CC proteins are selected from given protein complexes, which are not defined 

CC in the specification. The variants are encoded by nucleic acids that 

CC hybridize to the nucleic acids encoding the proteins under low stringency 

CC conditions. The protein complexes are useful as targets for an active 

CC agent of a pharmaceutical. These protein complexes are particularly 

CC useful as drugs targets for the treatment or preventing of a disease or 

CC disorder. The complexes and methods above are useful in diagnosing or 

CC screening for the presence of a disease or disorder or a predisposition 

CC for developing a disease or disorder in a subject. These are also useful 

CC in screening for a drug for treatment or prevention of a disease or 

CC disorder. The molecule that modulates the amount, activity or protein 

CC components of the complex is useful for the manufacture of a medicament 

CC for the treatment or prevention of a disease or disorder. This sequence 

CC corresponds to a protein of the invention. (Note: the sequence data for 

CC this patent did not form part of the printed specification but was 

CC obtained from the EPO in electronic format). 

XX 

SQ Sequence 445 AA; 

Query Match 9.2%; Score 253; DB 7; Length 445; 
Best Local Similarity 22.1%; Pred. No. 3.6e-10; 

Matches 112; Conservative 96; Mismatches 184; Indels 114; Gaps 21; 

Qy         23 LPERSGVVCRVKYCNSLPDIPFDPKFITYP FDQNRFVQYKAT S LEKQHK 71
              : :: : :|| |||| || : || : ::: ||| :|
Db          1 MSKKQEYIAPIKYQNSLPVPQLPPKLLVYPESPETNADSSQLINSLYIKTNVTNLIQQ-- 58

Qy         72 HDLLTEPDLGVTIDLI NPDTYRIDPNVLLDPADEKLLEEEIQAPTSSKRS 121
              : | | | : : || : : | | | | | | | | | : | : :
Db         59 DEDLGMPVDLMKFPGLLNKLDSKLLYGFD-NVKLDKDDRILLRD PRIDRLT 108

Qy        122 QQHAKVVPWMRKTEYISTEFNRYGISNEKPEVKIGVSVKQQFTEEEIYKDRDSQ 175
              : | : : | : | | | : | : : : | : : : |||
Db        109 KTDISKVTFLRRTEYVSNTIAAHDNTSLKRKRRL DDGDSDDENLDV 154

Qy        176 ITAIEKTFEDAQKSISQHYSKPRVTPVEVMPVFPDFKMWINPCAQVIFDSDPAPKDT 232
              | : : | | | | |||||: : | | |||
Db        155 NHIISRVEGTFNKTDK--WQHPVKKGVKMVKKWDLLPD TASMDQVYF ILKF 203

Qy        233 SGAAALEMMSQAMIR-GM MDEEGNQFVAYFLPVEETLKKRKRDQEEEMDYAPDDVYD 288
              | : | : | : : : | : : : | : : : : : : : : : | : | | | | : :
Db        204 MGSASLDTKEKKSLNTGIFRPVELEEDEWISMYATDHKDSAILENELEKGMDEMDDDSHE 263

Qy        289 YKIAREYNWNVKNKASKGYEENYFFIFREGDGV-YYNELETRVRLSKRRAKAGVQSG 344
              || | : : : : | || | : | | : | | | :: : | : | | : :
Db        264 GKIYKFKRIRDYDMKQVAEKPMTE-LAIRLNDKDGIAYYKPLRSKIELRRRRVNDIIKP- 321

Qy        345 TNALLVVKH RDMNEKELEAQEARKAQLEN HEPEEEEEEEMETEEK 389
              | | : | | : : | | : : : : : | : : | : | | : | : |
Db        322 LVKEHDIDQLNVTLRNPSTKEANIRDKLRMKFDPINFATVDEEDDEDEEQPEDVKK 377

Qy        390 EAGGSDEEQEKGSSSEKEGSEDEHSGSESEREEGDRDEASDKSGSGEDESSEDEARAARD 449
              | : | : : : | | || : | | |||: : : | : : | | | | |
Db        378 ESEG--DSKTEGSEQEGENEKDEEIKQEKENEQ DEENKQDENRAADT 422

Qy        450 KEEIFGSDADSEDDADSDDEDRGQAQ 475
              | ||| : :::: | :
Db        423 PET SDAVHTEQKPEEEKETLQEE 445





RESULT 15 
ABU42513 

ID ABU42513 standard; protein; 1633 AA. 
XX 

AC ABU42513; 
XX 

DT 19-JUN-2003 (first entry) 
XX 

DE Protein encoded by Prokaryotic essential gene #28040. 
XX 

KW Antisense; prokaryotic essential gene; cell proliferation; drug design. 
XX 

OS Staphylococcus epidermidis. 
XX 

PN WO200277183-A2. 
XX 

PD 03-OCT-2002. 
XX 

PF 21-MAR-2002; 2002WO-US009107. 
XX 

PR 21-MAR-2001; 2001US-00815242. 

PR 06-SEP-2001; 2001US-00948993. 

PR 25-OCT-2001; 2001US-0342923P. 

PR 08-FEB-2002; 2002US-00072851. 

PR 06-MAR-2002; 2002US-0362699P. 
XX 



PA (ELIT-) ELITRA PHARM INC. 
XX 

PI Wang L, Zamudio C, Malone C, Haselbeck R, Ohlsen KL, Zyskind JW; 

PI Wall D, Trawick JD, Carr GJ, Yamamoto R, Forsyth RA, Xu HH; 

XX 

DR WPI; 2003-029926/02. 

DR N-PSDB; ACA46383. 
XX 

PT New antisense nucleic acids, useful for identifying proteins or screening 

PT for homologous nucleic acids required for cellular proliferation to 

PT isolate candidate molecules for rational drug discovery programs. 
XX 

PS Claim 25; SEQ ID NO 70437; 1766pp; English. 
XX 

CC The invention relates to an isolated nucleic acid comprising any one of 

CC the 6213 antisense sequences given in the specification where expression 

CC of the nucleic acid inhibits proliferation of a cell. Also included are: 

CC (1) a vector comprising a promoter operably linked to the nucleic acid 

CC encoding a polypeptide whose expression is inhibited by the antisense 

CC nucleic acid; (2) a host cell containing the vector; (3) an isolated 

CC polypeptide or its fragment whose expression is inhibited by the 

CC antisense nucleic acid; (4) an antibody capable of specifically binding 

CC the polypeptide; (5) producing the polypeptide; (6) inhibiting cellular 

CC proliferation or the activity of a gene in an operon required for 

CC proliferation; (7) identifying a compound that influences the activity of 

CC the gene product or that has an activity against a biological pathway 

CC required for proliferation, or that inhibits cellular proliferation; (8) 

CC identifying a gene required for cellular proliferation or the biological 

CC pathway in which a proliferation-required gene or its gene product lies 

CC or a gene on which the test compound that inhibits proliferation of an 

CC organism acts; (9) manufacturing an antibiotic; (10) profiling a 

CC compound's activity; (11) a culture comprising strains in which the gene 

CC product is overexpressed or underexpressed; (12) determining the extent 

CC to which each of the strains is present in a culture or collection of 

CC strains; or (13) identifying the target of a compound that inhibits the 

CC proliferation of an organism. The antisense nucleic acids are useful for 

CC identifying proteins or screening for homologous nucleic acids required 

CC for cellular proliferation to isolate candidate molecules for rational 

CC drug discovery programs, or for screening homologous nucleic acids 

CC required for proliferation in cells other than S. aureus, S. typhimurium, 

CC K. pneumoniae or P. aeruginosa. The present sequence is encoded by one of 

CC the target prokaryotic essential genes. Note: The sequence data for this 

CC patent did not form part of the printed specification, but was obtained 

CC in electronic format directly from WIPO at 

CC ftp.wipo.int/pub/published_pct_sequences 
XX 

SQ Sequence 1633 AA; 

Query Match 8.6%; Score 237.5; DB 6; Length 1633; 

Best Local Similarity 21.2%; Pred. No. 2.8e-08; 

Matches 131; Conservative 85; Mismatches 238; Indels 165; Gaps 25; 

Qy         48 FITYPFDQNRFVQYKATSLEKQHKHDLLTEPDLGVTIDLINPDTYRIDP NVLLDP 102
              : : | | | : | | : : | | | : : ||| | | : |
Db        707 YVTLKDSNNRELQRVTTDQSGHYQFDNLQNGT--YTVEFAIPDNYTPSPANNSTNDAIDS 764

Qy        103 ADEKLLEEEIQAPTSSKRSQQHAKV VPWMRKTEYISTEFNRYGI--SNEKPEVK 154
              | : : : : : : | | : | : : | : | | |||
Db        765 DGERDGTRKVWAKGTINNADNMTVDTGFYLTPKYNVGDYVWEDTNKDGIQDDNEKGISN 824

Qy        155 IGVSVKQ -QFTEEEIY- -KDRDSQ 175
              : |::| : | || : :||
Db        825 VKVTLKNKNGDTIGTTTTDSNGKYEFTGLENGDYTIEFETPEGYTPTKQNSGSDEGKDSN 884

Qy        176 ITAIEKTFEDA-QKSISQHYSKPRVTPVEVMPVFPDFKMWINPCAQVIFDSDP 227
              | | : | | | : | : | | | : : | : ||
Db        885 GTKTTVTVKDADNKTIDSGFYKPIYN LGDY- WEDTNKDGIQDDSEKGI SGVK 936

Qy        228 -APKDTSGAA ALEMMSQAMIRGMMDEEGNQFVAYFLPVEETLKKRKRDQEEEMDY- 281
              | | : | | : : | : | | : | : | || | : : |
Db        937 VTLKDKNGNAIGTTTTDASGHYQFKGL--ENGSYTVEFETPSGYTPTKANSGQDITVDSN 994

Qy        282 APDDVYD YKIAR EYNWNVKNK AS KGY 307
              | : | ||: : | | || ||
Db        995 GITTTGIINGADNLTIDSGFYKTPKYSVGDYVWEDTNKDGIQDDNEKGISGVKVTLKDEK 1054

Qy        308 EENYFFIFREGD-GVYYNELETRVRLSKRRAKAGVQSGTNALLVVKHRDMN 357
              : | | : | ||| | : : : | : | : :
Db       1055 GNIISTTTTDENGKYQFDNLDSGNYIIHFEKPEGMTQTTANSG NDD 1100

Qy        358 EKELEAQEARKAQLENH EPEEEEEEEMETEEKEAGGSD 395
              | | : : : : | : : | :::::::: | |
Db       1101 EKDADGEDVR-VTITDHDDFSIDNGYFDDDSDSDSDADSDSDSDSDSDADSDSDADSNSD 1159

Qy        396 EEQEKGSSSEKEGSEDEHSGSESERE-EGDRDEASDK-SGSGEDESSEDEARAARDKEEI 453
              : : ||:: | | | : | : : : | | || || | | : : : : | :
Db       1160 SDSDSDSDSDSDSDSDSDSDSDSDSDADSDSDADSDSDSDSDSDSDSDSDSDSDSDSDSD 1219

Qy        454 FGSDADSEDDADSDDEDRGQAQGGSDNDSDSGSNGGGQRSRSHSRSASPFPSGSEHSAQE 513
              | | | | | : | | | | | : : | | : | | | | | : | | | | | | | : :
Db       1220 SDSDADSDSDADSDSDADSDSDADSDSDSDSDSDAD SDSDSDSDSDADSDSDSDSDS 1276

Qy        514 DGSEAAASDS-SEADSDSD 531
              | : ||| | : | || | | |
Db       1277 DADSDSDSDSDSDADSDSD 1295



Search completed: April 25, 2006, 09:07:10 
Job time : 194 secs 



GenCore version 5.1.7 
Copyright (c) 1993 - 2006 Biocceleration Ltd. 



OM protein - protein search, using sw model 
Run on:         April 25, 2006, 09:04:03 ; Search time 231 Seconds
                                           (without alignments)
                                           1621.799 Million cell updates/sec

Title:          US-10-721-553-2
Perfect score:  2764
Sequence:       1 MAPTIQTQAQREDGHRPNSH...QEDGSEAAASDSSEADSDSD 531

Scoring table:  BLOSUM62
                Gapop 10.0 , Gapext 0.5

Searched:       2166443 seqs, 705528306 residues

Total number of hits satisfying chosen parameters:      2166443

Minimum DB seq length: 0
Maximum DB seq length: 2000000000

Post-processing: Minimum Match 0%
                 Maximum Match 100%
                 Listing first 45 summaries

Database:       UniProt_05.80:*
                1: uniprot_sprot:*
                2: uniprot_trembl:*

Pred. No. is the number of results predicted by chance to have a 
score greater than or equal to the score of the result being printed, 
and is derived by analysis of the total score distribution. 
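
The throughput figure in the run header appears to be the size of the dynamic-programming matrix (query length times the number of database residues) divided by the search time. A minimal sketch under that assumption, using the values reported for this run (a 531-residue query, 705528306 residues searched, 231 seconds); the function name is hypothetical and the formula is inferred from the report's own numbers.

```python
def million_cell_updates_per_sec(query_len: int, db_residues: int,
                                 seconds: float) -> float:
    """Matrix cells (query residues x database residues) filled per second,
    expressed in millions."""
    return query_len * db_residues / seconds / 1e6


# Values from the run header of this search.
print(round(million_cell_updates_per_sec(531, 705528306, 231), 3))  # 1621.799
```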

SUMMARIES 



Result            Query
   No.     Score  Match  Length  DB  ID              Description

     1      2764  100.0     531   2  Q9H166_HUMAN    Q9h166 homo sapien
     2      2744   99.3     531   2  Q9NUU9_HUMAN    Q9nuu9 homo sapien
     3      2740   99.1     533   2  Q5RAX0_PONPY    Q5rax0 pongo pygma
     4    2730.5   98.8     534   2  Q5RE77_PONPY    Q5re77 pongo pygma
     5      2718   98.3     535   2  Q8K2T8_MOUSE    Q8k2t8 mus musculu
     6      2708   98.0     535   2  Q9JJ99_MOUSE    Q9jj99 mus musculu
     7      2705   97.9     535   2  Q4V886_RAT      Q4v886 rattus norv
     8    2531.5   91.6     510   2  O75239_HUMAN    O75239 homo sapien
     9    2183.5   79.0     520   2  Q6P2Y1_XENTR    Q6p2y1 xenopus tro
    10      2030   73.4     503   2  Q4U0S5_BRARE    Q4u0s5 brachydanio
    11      1995   72.2     485   2  Q8N7H5_HUMAN    Q8n7h5 homo sapien
    12      1984   71.8     377   2  Q9CS63_MOUSE    Q9cs63 mus musculu
    13      1935   70.0     407   2  Q68F51_XENLA    Q68f51 xenopus lae
    14    1757.5   63.6     370   2  Q4RRR2_TETNG    Q4rrr2 tetraodon n
    15    1244.5   45.0     538   2  Q9VN55_DROME    Q9vn55 drosophila
    16      1129   40.8     4??   2  Q7PXA3_ANOGA    Q7pxa3 anopheles g
    17       713   25.8     453   2  Q60MA7_CAEBR    Q60ma7 caenorhabdi
    18       645   23.3     425   2  P90783_CAEEL    P90783 caenorhabdi
    19     481.5   17.4     499   2  Q55E33_DICDI    Q55e33 dictyosteli
    20     372.5   13.5     589   2  Q8RW91_ARATH    Q8rw91 arabidopsis
    21     361.5   13.1     593   2  Q9MA04_ARATH    Q9ma04 arabidopsis
    22     358.5   13.0     451   2  Q6ZD92_ORYSA    Q6zd92 oryza sativ
    23     335.5   12.1     547   2  Q9CA82_ARATH    Q9ca82 arabidopsis
    24       311   11.3     703   2  Q4P5K3_USTMA    Q4p5k3 ustilago ma
    25     286.5   10.4     386   2  Q6C509_YARLI    Q6c509 yarrowia li
    26     280.5   10.1     457   2  Q9US06_SCHPO    Q9us06 schizosacch
    27     266.5    9.6     572   2  Q56PB7_RAT      Q56pb7 rattus norv
    28     265.5    9.6     791   2  Q9DGL1_FUGRU    Q9dgl1 fugu rubrip
    29     263.5    9.5     467   2  Q59Y36_CANAL    Q59y36 candida alb
    30     257.5    9.3     571   2  Q8MTN7_TRISP    Q8mtn7 trichinella
    31       253    9.2     445   1  PAF1_YEAST      P38351 saccharomyc
    32     251.5    9.1     538   2  Q9ET15_MOUSE    Q9et15 mus musculu
    33     250.5    9.1     458   2  Q6BT93_DEBHA    Q6bt93 debaryomyce
    34       249    9.0     440   2  Q55S74_CRYNE    Q55s74 cryptococcu
    35       249    9.0     440   2  Q5KGM6_CRYNE    Q5kgm6 cryptococcu
    36     245.5    8.9     784   2  Q7LZ90_TORCA    Q7lz90 torpedo cal
    37       245    8.9    1848   2  Q7RGP8_PLAYO    Q7rgp8 plasmodium
    38     244.5    8.8     934   2  Q9GMD3_BOVIN    Q9gmd3 bos taurus
    39     243.5    8.8     438   2  Q6FXJ9_CANGA    Q6fxj9 candida gla
    40       241    8.7     792   2  Q9YTL7_9HERP    Q9ytl7 ateline her
    41       240    8.7    1394   1  CNGB1_BOVIN     Q28181 bos taurus
    42     239.5    8.7     544   2  Q56PC0_FELCA    Q56pc0 felis silve
    43     238.5    8.6    1381   2  Q5HIB3_STAAC    Q5hib3 staphylococ
    44       238    8.6     613   2  Q6UDM5_BRARE    Q6udm5 brachydanio
    45     236.5    8.6    1633   2  Q8CMP4_STAEP    Q8cmp4 staphylococ

ALIGNMENTS 



RESULT 1 
Q9H166_HUMAN 

ID Q9H166_HUMAN PRELIMINARY; PRT; 531 AA. 

AC Q9H166; 

DT 01-MAR-2001 (TrEMBLrel. 16, Created) 

DT 01-MAR-2001 (TrEMBLrel. 16, Last sequence update) 

DT 13-SEP-2005 (TrEMBLrel. 31, Last annotation update) 

DE Hypothetical protein PD2. 

GN Name=PD2; 

OS Homo sapiens (Human). 

OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; 

OC Mammalia; Eutheria; Euarchontoglires; Primates; Catarrhini; Hominidae; 

OC Homo. 

OX NCBI_TaxID=9606; 

RN [1] 

RP NUCLEOTIDE SEQUENCE. 

RA Batra S.K., Choudhury A., Keita M., Schmied B.M., Vanlith M., 

RA Walter N.A.R., Jokerst J., Sikela J.M., Metzgar R.S., 

RA Hollingsworth M.A.; 

RL Submitted (JUN-2000) to the EMBL/GenBank/DDBJ databases. 

RN [2] 

RP NUCLEOTIDE SEQUENCE. 



RC TISSUE=Muscle, and Placenta; 

RX MEDLINE=22388257; PubMed=12477932; DOI=10.1073/pnas.242603899; 

RA Strausberg R.L., Feingold E.A., Grouse L.H., Derge J.G., 

RA Klausner R.D., Collins F.S., Wagner L., Shenmen C.M., Schuler G.D., 

RA Altschul S.F., Zeeberg B., Buetow K.H., Schaefer C.F., Bhat N.K., 

RA Hopkins R.F., Jordan H., Moore T., Max S.I., Wang J., Hsieh F., 

RA Diatchenko L., Marusina K., Farmer A.A., Rubin G.M., Hong L., 

RA Stapleton M., Soares M.B., Bonaldo M.F., Casavant T.L., Scheetz T.E., 

RA Brownstein M.J., Usdin T.B., Toshiyuki S., Carninci P., Prange C., 

RA Raha S.S., Loquellano N.A., Peters G.J., Abramson R.D., Mullahy S.J., 

RA Bosak S.A., McEwan P.J., McKernan K.J., Malek J.A., Gunaratne P.H., 

RA Richards S., Worley K.C., Hale S., Garcia A.M., Gay L.J., Hulyk S.W., 

RA Villalon D.K., Muzny D.M., Sodergren E.J., Lu X., Gibbs R.A., 

RA Fahey J., Helton E., Ketteman M., Madan A., Rodrigues S., Sanchez A., 

RA Whiting M., Madan A., Young A.C., Shevchenko Y., Bouffard G.G., 

RA Blakesley R.W., Touchman J.W., Green E.D., Dickson M.C., 

RA Rodriguez A.C., Grimwood J., Schmutz J., Myers R.M., 

RA Butterfield Y.S.N., Krzywinski M.I., Skalska U., Smailus D.E., 

RA Schnerch A., Schein J.E., Jones S.J.M., Marra M.A.; 

RT "Generation and initial analysis of more than 15,000 full-length human 

RT and mouse cDNA sequences."; 

RL Proc. Natl. Acad. Sci. U.S.A. 99:16899-16903(2002). 

RN [3] 

RP NUCLEOTIDE SEQUENCE. 

RC TISSUE=Muscle; 



RG NIH MGC Project; 

RL Submitted (AUG-2001) to the EMBL/GenBank/DDBJ databases. 

RN [4] 

RP NUCLEOTIDE SEQUENCE. 



RC TISSUE=Placenta; 

RA Strausberg R.; 

RL Submitted (NOV-2000) to the EMBL/GenBank/DDBJ databases. 

DR EMBL; AJ401156; CAC20564.1; -; mRNA. 

DR EMBL; BC013402; AAH13402.1; -; mRNA. 

DR EMBL; BC000017; AAH00017.1; -; mRNA. 

DR Ensembl; ENSG00000006712; Homo sapiens. 

DR InterPro; IPR007133; Paf1. 

DR Pfam; PF03985; Paf1; 1. 

KW Hypothetical protein. 

SQ SEQUENCE 531 AA; 59976 MW; 756F800AA64255D6 CRC64; 



Query Match 100.0%; Score 2764; DB 2; Length 531; 

Best Local Similarity 100.0%; Pred. No. 2.7e-122; 

Matches 531; Conservative 0; Mismatches 0; Indels 0; Gaps 0; 

Qy          1 MAPTIQTQAQREDGHRPNSHRTLPERSGVVCRVKYCNSLPDIPFDPKFITYPFDQNRFVQ 60
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db          1 MAPTIQTQAQREDGHRPNSHRTLPERSGVVCRVKYCNSLPDIPFDPKFITYPFDQNRFVQ 60

Qy         61 YKATSLEKQHKHDLLTEPDLGVTIDLINPDTYRIDPNVLLDPADEKLLEEEIQAPTSSKR 120
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db         61 YKATSLEKQHKHDLLTEPDLGVTIDLINPDTYRIDPNVLLDPADEKLLEEEIQAPTSSKR 120

Qy        121 SQQHAKVVPWMRKTEYISTEFNRYGISNEKPEVKIGVSVKQQFTEEEIYKDRDSQITAIE 180
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        121 SQQHAKVVPWMRKTEYISTEFNRYGISNEKPEVKIGVSVKQQFTEEEIYKDRDSQITAIE 180

Qy        181 KTFEDAQKSISQHYSKPRVTPVEVMPVFPDFKMWINPCAQVIFDSDPAPKDTSGAAALEM 240
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        181 KTFEDAQKSISQHYSKPRVTPVEVMPVFPDFKMWINPCAQVIFDSDPAPKDTSGAAALEM 240

Qy        241 MSQAMIRGMMDEEGNQFVAYFLPVEETLKKRKRDQEEEMDYAPDDVYDYKIAREYNWNVK 300
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        241 MSQAMIRGMMDEEGNQFVAYFLPVEETLKKRKRDQEEEMDYAPDDVYDYKIAREYNWNVK 300

Qy        301 NKASKGYEENYFFIFREGDGVYYNELETRVRLSKRRAKAGVQSGTNALLVVKHRDMNEKE 360
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        301 NKASKGYEENYFFIFREGDGVYYNELETRVRLSKRRAKAGVQSGTNALLVVKHRDMNEKE 360

Qy        361 LEAQEARKAQLENHEPEEEEEEEMETEEKEAGGSDEEQEKGSSSEKEGSEDEHSGSESER 420
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        361 LEAQEARKAQLENHEPEEEEEEEMETEEKEAGGSDEEQEKGSSSEKEGSEDEHSGSESER 420

Qy        421 EEGDRDEASDKSGSGEDESSEDEARAARDKEEIFGSDADSEDDADSDDEDRGQAQGGSDN 480
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        421 EEGDRDEASDKSGSGEDESSEDEARAARDKEEIFGSDADSEDDADSDDEDRGQAQGGSDN 480

Qy        481 DSDSGSNGGGQRSRSHSRSASPFPSGSEHSAQEDGSEAAASDSSEADSDSD 531
              |||||||||||||||||||||||||||||||||||||||||||||||||||
Db        481 DSDSGSNGGGQRSRSHSRSASPFPSGSEHSAQEDGSEAAASDSSEADSDSD 531



RESULT 2 
Q9NUU9_HUMAN 

ID Q9NUU9_HUMAN PRELIMINARY; PRT; 531 AA. 

AC Q9NUU9; 

DT 01-OCT-2000 (TrEMBLrel. 15, Created) 

DT 01-OCT-2000 (TrEMBLrel. 15, Last sequence update) 

DT 01-JUN-2003 (TrEMBLrel. 24, Last annotation update) 

DE Hypothetical protein FLJ11123. 

OS Homo sapiens (Human) . 

OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; 

OC Mammalia; Eutheria; Euarchontoglires; Primates; Catarrhini; Hominidae; 

OC Homo. 

OX NCBI_TaxID=9606; 

RN [1] 

RP NUCLEOTIDE SEQUENCE. 

RC TISSUE=Placenta; 

RX PubMed=14702039; DOI=10.1038/ng1285; 

RA Ota T., Suzuki Y., Nishikawa T., Otsuki T., Sugiyama T., Irie R., 

RA Wakamatsu A., Hayashi K., Sato H., Nagai K., Kimura K., Makita H., 

RA Sekine M., Obayashi M., Nishi T., Shibahara T., Tanaka T., Ishii S., 

RA Yamamoto J.-I., Saito K., Kawai Y., Isono Y., Nakamura Y., 

RA Nagahari K., Murakami K., Yasuda T., Iwayanagi T., Wagatsuma M., 

RA Shiratori A., Sudo H., Hosoiri T., Kaku Y., Kodaira H., Kondo H., 

RA Sugawara M., Takahashi M., Kanda K., Yokoi T., Furuya T., Kikkawa E., 

RA Omura Y., Abe K., Kamihara K., Katsuta N., Sato K., Tanikawa M., 

RA Yamazaki M., Ninomiya K., Ishibashi T., Yamashita H., Murakawa K., 

RA Fujimori K., Tanai H., Kimata M., Watanabe M., Hiraoka S., Chiba Y., 

RA Ishida S., Ono Y., Takiguchi S., Watanabe S., Yosida M., Hotuta T., 

RA Kusano J., Kanehori K., Takahashi-Fujii A., Hara H., Tanase T.-O., 

RA Nomura Y., Togiya S., Komai F., Hara R., Takeuchi K., Arita M., 

RA Imose N., Musashino K., Yuuki H., Oshima A., Sasaki N., Aotsuka S., 

RA Yoshikawa Y., Matsunawa H., Ichihara T., Shiohata N., Sano S., 

RA Moriya S., Momiyama H., Satoh N., Takami S., Terashima Y., Suzuki O., 

RA Nakagawa S., Senoh A., Mizoguchi H., Goto Y., Shimizu F., Wakebe H., 

RA Hishigaki H., Watanabe T., Sugiyama A., Takemoto M., Kawakami B., 

RA Yamazaki M., Watanabe K., Kumagai A., Itakura S., Fukuzumi Y., 

RA Fujimori Y., Komiyama M., Tashiro H., Tanigami A., Fujiwara T., 

RA Ono T., Yamada K., Fujii Y., Ozaki K., Hirao M., Ohmori Y., 

RA Kawabata A., Hikiji T., Kobatake N., Inagaki H., Ikema Y., Okamoto S., 

RA Okitani R., Kawakami T., Noguchi S., Itoh T., Shigeta K., Senba T., 

RA Matsumura K., Nakajima Y., Mizuno T., Morinaga M., Sasaki M., 

RA Togashi T., Oyama M., Hata H., Watanabe M., Komatsu T., 

RA Mizushima-Sugano J., Satoh T., Shirai Y., Takahashi Y., Nakagawa K., 

RA Okumura K., Nagase T., Nomura N., Kikuchi H., Masuho Y., Yamashita R., 

RA Nakai K., Yada T., Nakamura Y., Ohara O., Isogai T., Sugano S.; 

RT "Complete sequencing and characterization of 21,243 full-length human 

RT cDNAs."; 

RL Nat. Genet. 36:40-45(2004). 

DR EMBL; AK001985; BAA92020.1; -; mRNA. 

DR InterPro; IPR007133; Paf1. 

DR Pfam; PF03985; Paf1; 1. 

SQ SEQUENCE 531 AA; 59797 MW; 446D2588B20E42DC CRC64; 

Query Match 99.3%; Score 2744; DB 2; Length 531; 

Best Local Similarity 99.4%; Pred. No. 2.3e-121; 

Matches 528; Conservative 0; Mismatches 3; Indels 0; Gaps 0; 

Qy          1 MAPTIQTQAQREDGHRPNSHRTLPERSGVVCRVKYCNSLPDIPFDPKFITYPFDQNRFVQ 60
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||| ||
Db          1 MAPTIQTQAQREDGHRPNSHRTLPERSGVVCRVKYCNSLPDIPFDPKFITYPFDQNRIVQ 60

Qy         61 YKATSLEKQHKHDLLTEPDLGVTIDLINPDTYRIDPNVLLDPADEKLLEEEIQAPTSSKR 120
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db         61 YKATSLEKQHKHDLLTEPDLGVTIDLINPDTYRIDPNVLLDPADEKLLEEEIQAPTSSKR 120

Qy        121 SQQHAKVVPWMRKTEYISTEFNRYGISNEKPEVKIGVSVKQQFTEEEIYKDRDSQITAIE 180
              ||||||||||||||||||||||||||||||| ||||||||||||||||||||||||||||
Db        121 SQQHAKVVPWMRKTEYISTEFNRYGISNEKPGVKIGVSVKQQFTEEEIYKDRDSQITAIE 180

Qy        181 KTFEDAQKSISQHYSKPRVTPVEVMPVFPDFKMWINPCAQVIFDSDPAPKDTSGAAALEM 240
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        181 KTFEDAQKSISQHYSKPRVTPVEVMPVFPDFKMWINPCAQVIFDSDPAPKDTSGAAALEM 240

Qy        241 MSQAMIRGMMDEEGNQFVAYFLPVEETLKKRKRDQEEEMDYAPDDVYDYKIAREYNWNVK 300
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        241 MSQAMIRGMMDEEGNQFVAYFLPVEETLKKRKRDQEEEMDYAPDDVYDYKIAREYNWNVK 300

Qy        301 NKASKGYEENYFFIFREGDGVYYNELETRVRLSKRRAKAGVQSGTNALLVVKHRDMNEKE 360
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        301 NKASKGYEENYFFIFREGDGVYYNELETRVRLSKRRAKAGVQSGTNALLVVKHRDMNEKE 360

Qy        361 LEAQEARKAQLENHEPEEEEEEEMETEEKEAGGSDEEQEKGSSSEKEGSEDEHSGSESER 420
              ||||||||||||||||| ||||||||||||||||||||||||||||||||||||||||||
Db        361 LEAQEARKAQLENHEPEGEEEEEMETEEKEAGGSDEEQEKGSSSEKEGSEDEHSGSESER 420

Qy        421 EEGDRDEASDKSGSGEDESSEDEARAARDKEEIFGSDADSEDDADSDDEDRGQAQGGSDN 480
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        421 EEGDRDEASDKSGSGEDESSEDEARAARDKEEIFGSDADSEDDADSDDEDRGQAQGGSDN 480

Qy        481 DSDSGSNGGGQRSRSHSRSASPFPSGSEHSAQEDGSEAAASDSSEADSDSD 531
              |||||||||||||||||||||||||||||||||||||||||||||||||||
Db        481 DSDSGSNGGGQRSRSHSRSASPFPSGSEHSAQEDGSEAAASDSSEADSDSD 531



RESULT 3 
Q5RAX0_PONPY 



ID Q5RAX0_PONPY PRELIMINARY; PRT; 533 AA. 

AC Q5RAX0; 

DT 01-FEB-2005 (TrEMBLrel. 29, Created) 

DT 01-FEB-2005 (TrEMBLrel. 29, Last sequence update) 

DT 01-FEB-2005 (TrEMBLrel. 29, Last annotation update) 

DE Hypothetical protein DKFZp469K121. 

GN Name=DKFZp469K121; 

OS Pongo pygmaeus (Orangutan). 

OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; 

OC Mammalia; Eutheria; Euarchontoglires; Primates; Catarrhini; Hominidae; 

OC Pongo. 

OX NCBI_TaxID=9600; 

RN [1] 

RP NUCLEOTIDE SEQUENCE. 

RC TISSUE=Kidney; 

RG The German cDNA Consortium; 

RA Poustka A., Albert R., Moosmayer P., Schupp I., Wellenreuther R., 

RA Mewes H.W., Weil B., Amid C., Osanger A., Fobo G., Han M., Wiemann S.; 

RL Submitted (NOV-2004) to the EMBL/GenBank/DDBJ databases. 

DR EMBL; CR858891; CAH91090.1; -; mRNA. 

DR InterPro; IPR007133; Paf1. 

DR Pfam; PF03985; Paf1; 1. 

KW Hypothetical protein. 

SQ SEQUENCE 533 AA; 60248 MW; 5D01B4E99420F050 CRC64; 



Query Match 99.1%; Score 2740; DB 2; Length 533; 

Best Local Similarity 99.2%; Pred. No. 3.6e-121; 

Matches 529; Conservative 0; Mismatches 2; Indels 2; Gaps 1; 



Qy 1 MAPTIQTQAQREDGHRPNSHRTLPERSGVVCRVKYCNSLPDIPFDPKFITYPFDQNRFVQ 60 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 1 MAPTIQTQAQREDGHRPNSHRTLPERSGVVCRVKYCNSLPDIPFDPKFITYPFDQNRFVQ 60 

Qy 61 YKATSLEKQHKHDLLTEPDLGVTIDLINPDTYRIDPNVLLDPADEKLLEEEIQAPTSSKR 120 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 1 I I I I 
Db 61 YKATSLEKQHKHDLLTEPDLGVTIDLINPDTYRIDPNVLLDPADEKLLEEEIQAPTSSKR 120 

Qy 121 SQQHAKVVPWMRKTEYISTEFNRYGISNEKPEVKIGVSVKQQFTEEEIYKDRDSQITAIE 180 

I I I I I I I I I I I I I I I I I I I I I II | I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 121 SQQHAKVVPWMRKTEYISTEFNRYGISNEKPEVKIGVSVKQQFTEEEIYKDRDSQITAIE 180 

Qy 181 KTFEDAQKSISQHYSKPRVTPVEVMPVFPDFKMWINPCAQVIFDSDPAPKDTSGAAALEM 240 

I I I I I II I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 181 KTFEDAQKSISQHYSKPRVTPVEVMPVFPDFKMWINPCAQVIFDSDPAPKDTSGAAALEM 240 

Qy 241 MSQAMIRGMMDEEGNQFVAYFLPVEETLKKRKRDQEEEMDYAPDDVYDYKIAREYNWNVK 300 

I I I I I I I I I II I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I 
Db 241 MSQAMIRGMMDEEGNQFVAYFLPVEETLKKRKRDQEEEMDYAPDDVYDYKIAREYNWNVK 300 

Qy 301 NKASKGYEENYFFIFREGDGVYYNELETRVRLSKRRAKAGVQSGTNALLVVKHRDMNEKE 360 



II 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 II 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 

Db 301 NKASKGYEENYFFIFREGDGVYYNELETRVRLSKRRAKAGVQSGTNALLVVKHRDMNEKE 360 

Qy 361 LEAQEARKAQLENHEPEEEEEEEMETEEKEAGGSDEEQEKGSSSEKEGSEDEHSGSESER 420 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 361 LEAQEARKAQLENHEPEEEEEEEMETEEKEAGGSDEEQEKGSSSEKEGSEDERSGSESER 420 

Qy 421 EEGDRDEASDKSGSGEDESSEDEARAARDKEEIFGSDADSEDDADSDDEDRGQAQGGSDN 480 

I I I I I I I I I I I I I I I I II I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I 
Db 421 EEGDRDEASDKSGSGEDESSEDEARAARDKEEIFGSDADSEDDADSDDEDRGQAQGGSDN 480 

Qy 481 DSDSGSNGGGQRSRSH--SRSASPFPSGSEHSAQEDGSEAAASDSSEADSDSD 531 

I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 481 DSDSGSNGGGQRSRSHSRSRSASPFPSGSEHSAQEDGSEAAAPDSSEADSDSD 533 

RESULT 4 
Q5RE77_PONPY 

ID Q5RE77_PONPY PRELIMINARY; PRT; 534 AA. 

AC Q5RE77; 

DT 01-FEB-2005 (TrEMBLrel. 29, Created) 

DT 01-FEB-2005 (TrEMBLrel. 29, Last sequence update) 

DT 01-FEB-2005 (TrEMBLrel. 29, Last annotation update) 

DE Hypothetical protein DKFZp468J227. 

GN Name=DKFZp468J227; 

OS Pongo pygmaeus (Orangutan). 

OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; 

OC Mammalia; Eutheria; Euarchontoglires; Primates; Catarrhini; Hominidae; 

OC Pongo. 

OX NCBI_TaxID=9600; 

RN [1] 

RP NUCLEOTIDE SEQUENCE. 

RC TISSUE=Heart; 

RG The German cDNA Consortium; 

RA Ansorge W., Krieger S., Regiert T., Rittmueller C., Schwager B., 

RA Mewes H.W., Weil B., Amid C., Osanger A., Fobo G., Han M., Wiemann S.; 

RL Submitted (NOV-2004) to the EMBL/GenBank/DDBJ databases. 

DR EMBL; CR857657; CAH89930.1; -; 

DR InterPro; IPR007133; Paf1. 

DR Pfam; PF03985; Paf1; 1. 

KW Hypothetical protein. 

SQ SEQUENCE 534 AA; 60364 MW; E256C4EE1CA171FB CRC64; 

Query Match 98.8%; Score 2730.5; DB 2; Length 534; 

Best Local Similarity 99.1%; Pred. No. 1e-120; 

Matches 529; Conservative 0; Mismatches 2; Indels 3; Gaps 2; 

Qy 1 MAPTIQTQAQREDGHRPNSHRTLPERSGVVCRVKYCNSLPDIPFDPKFITYPFDQNRFVQ 60 

II I I || I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 1 MAPTIQTQAQREDGHRPNSHRTLPERSGVVCRVKYCNSLPDIPFDPKFITYPFDQNRFVQ 60 

Qy 61 YKATSLEKQHKHDLLTEPDLGVTIDLINPDTYRIDPNVLLDPADEKLLEEEIQAPTSSKR 120 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 61 YKATSLEKQHKHDLLTEPDLGVTIDLINPDTYRIDPNVLLDPADEKLLEEEIQAPTSSKR 120 

Qy 121 SQQHAKVVPWMRKTEYISTEFNRYGISNEKPEVKIGVSVKQQFTEEEIYKDRDSQITAIE 180 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 121 SQQHAKVVPWMRKTEYISTEFNRYGISNEKPEVKIGVSVKQQFTEEEIYKDRDSQITAIE 180 

Qy 181 KTFEDAQKSISQHYSKPRVTPVEVMPVFPDFKMWINPCAQVIFDSDPAPKDTSGAAALEM 240 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 181 KTFEDAQKSISQHYSKPRVTPVEVMPVFPDFKMWINPCAQVIFDSDPAPKDTSGAAALEM 240 

Qy 241 MSQAMIRGMMDEEGNQFVAYFLPVEETLKKRKRDQEEEMDYAPDDVYDYKIAREYNWNVK 300 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 241 MSQAMIRGMMDEEGNQFVAYFLPVEETLKKRKRDQEEEMDYAPDDVYDYKIAREYNWNVK 300 

Qy 301 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 301 NKASKGYEENYFFIFREGDGVYYNELETRRVRLSKRRAKAGVQSGTNALLVVKHRDMNEK 360 

Qy 360 ELEAQEARKAQLENHEPEEEEEEEMETEEKEAGGSDEEQEKGSSSEKEGSEDEHSGSESE 419 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 361 ELEAQEARKAQLENHEPEEEEEEEMETEEKEAGGSDEEQEKGSSSEKEGSEDERSGSESE 420 

Qy 420 REEGDRDEASDKSGSGEDESSEDEARAARDKEEIFGSDADSEDDADSDDEDRGQAQGGSD 479 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 421 REEGDRDEASDKSGSGEDESSEDEARAARDKEEIFGSDADSEDDADSDDEDRGQAQGGSD 480 

Qy 480 NDSDSGSNGGGQRSRSH--SRSASPFPSGSEHSAQEDGSEAAASDSSEADSDSD 531 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 481 NDSDSGSNGGGQRSRGHSRSRSASPFPSGSEHSAQEDGSEAAASDSSEADSDSD 534 
RESULT 5 
Q8K2T8_MOUSE 

ID Q8K2T8_MOUSE PRELIMINARY; PRT; 535 AA. 

AC Q8K2T8; 

DT 01-OCT-2002 (TrEMBLrel. 22, Created) 

DT 01-OCT-2002 (TrEMBLrel. 22, Last sequence update) 

DT 10-MAY-2005 (TrEMBLrel. 30, Last annotation update) 

DE RIKEN cDNA 5730511K23. 

GN Name=5730511K23Rik; 

OS Mus musculus (Mouse). 

OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; 

OC Mammalia; Eutheria; Euarchontoglires; Glires; Rodentia; Sciurognathi; 

OC Muroidea; Muridae; Murinae; Mus. 

OX NCBI_TaxID=10090; 

RN [1] 

RP NUCLEOTIDE SEQUENCE. 

RC STRAIN=CZECH II, and C57BL/6; 

RC TISSUE=Brain, and 

RC Mammary tumor metastatized to lung. MMTV-LTR/Wntl model. Expression 

RC driven by an MMTV-LTR enhancer.; 

RX MEDLINE=22388257; PubMed=12477932; DOI=10.1073/pnas.242603899; 

RA Strausberg R.L., Feingold E.A., Grouse L.H., Derge J.G., 

RA Klausner R.D., Collins F.S., Wagner L., Shenmen C.M., Schuler G.D., 

RA Altschul S.F., Zeeberg B., Buetow K.H., Schaefer C.F., Bhat N.K., 

RA Hopkins R.F., Jordan H., Moore T., Max S.I., Wang J., Hsieh F., 

RA Diatchenko L., Marusina K., Farmer A.A., Rubin G.M., Hong L., 

RA Stapleton M., Soares M.B., Bonaldo M.F., Casavant T.L., Scheetz T.E., 

RA Brownstein M.J., Usdin T.B., Toshiyuki S., Carninci P., Prange C., 

RA Raha S.S., Loquellano N.A., Peters G.J., Abramson R.D., Mullahy S.J., 

RA Bosak S.A., McEwan P.J., McKernan K.J., Malek J.A., Gunaratne P.H., 

RA Richards S., Worley K.C., Hale S., Garcia A.M., Gay L.J., Hulyk S.W., 

RA Villalon D.K., Muzny D.M., Sodergren E.J., Lu X., Gibbs R.A., 

RA Fahey J., Helton E., Ketteman M., Madan A., Rodrigues S., Sanchez A., 

RA Whiting M., Madan A., Young A.C., Shevchenko Y., Bouffard G.G., 

RA Blakesley R.W., Touchman J.W., Green E.D., Dickson M.C., 

RA Rodriguez A.C., Grimwood J., Schmutz J., Myers R.M., 

RA Butterfield Y.S.N., Krzywinski M.I., Skalska U., Smailus D.E., 

RA Schnerch A., Schein J.E., Jones S.J.M., Marra M.A.; 

RT "Generation and initial analysis of more than 15,000 full-length human 

RT and mouse cDNA sequences."; 

RL Proc. Natl. Acad. Sci. U.S.A. 99:16899-16903(2002). 

RN [2] 

RP NUCLEOTIDE SEQUENCE. 

RC STRAIN=CZECH II; 

RC TISSUE=Mammary tumor metastatized to lung. MMTV-LTR/Wntl model. 

RC Expression driven by an MMTV-LTR enhancer.; 

RA Strausberg R.; 

RL Submitted (MAY-2002) to the EMBL/GenBank/DDBJ databases. 

RN [3] 

RP NUCLEOTIDE SEQUENCE. 

RC STRAIN=C57BL/6; TISSUE=Brain; 

RA Director MGC Project; 

RL Submitted (OCT-2004) to the EMBL/GenBank/DDBJ databases. 

DR EMBL; BC029843; AAH29843.1; -; mRNA. 

DR EMBL; BC083337; AAH83337.1; -; mRNA. 

DR Ensembl; ENSMUSG00000003437; Mus musculus. 

DR MGI; MGI:1923988; 5730511K23Rik. 

DR InterPro; IPR007133; Paf1. 

DR Pfam; PF03985; Paf1; 1. 

SQ SEQUENCE 535 AA; 60518 MW; 7A5EAB1284988070 CRC64; 



Query Match 98.3%; Score 2718; DB 2; Length 535; 

Best Local Similarity 98.1%; Pred. No. 3.9e-120; 

Matches 525; Conservative 2; Mismatches 4; Indels 4; Gaps 1; 



Qy 1 MAPTIQTQAQREDGHRPNSHRTLPERSGVVCRVKYCNSLPDIPFDPKFITYPFDQNRFVQ 60 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 1 MAPTIQTQAQREDGHRPNSHRTLPERSGVVCRVKYCNSLPDIPFDPKFITYPFDQNRFVQ 60 

Qy 61 YKATSLEKQHKHDLLTEPDLGVTIDLINPDTYRIDPNVLLDPADEKLLEEEIQAPTSSKR 120 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 61 YKATSLEKQHKHDLLTEPDLGVTIDLINPDTYRIDPNVLLDPADEKLLEEEIQAPTSSKR 120 

Qy 121 SQQHAKVVPWMRKTEYISTEFNRYGISNEKPEVKIGVSVKQQFTEEEIYKDRDSQITAIE 180 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 121 SQQHAKVVPWMRKTEYISTEFNRYGISNEKPEVKIGVSVKQQFTEEEIYKDRDSQITAIE 180 

Qy 181 KTFEDAQKSISQHYSKPRVTPVEVMPVFPDFKMWINPCAQVIFDSDPAPKDTSGAAALEM 240 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 181 KTFEDAQKSISQHYSKPRVTPVEVMPVFPDFKMWINPCAQVIFDSDPAPKDTSGAAALEM 240 

Qy 241 MSQAMIRGMMDEEGNQFVAYFLPVEETLKKRKRDQEEEMDYAPDDVYDYKIAREYNWNVK 300 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 241 MSQAMIRGMMDEEGNQFVAYFLPVEETLKKRKRDQEEEMDYAPDDVYDYKIAREYNWNVK 300 

Qy 301 NKASKGYEENYFFIFREGDGVYYNELETRVRLSKRRAKAGVQSGTNALLVVKHRDMNEKE 360 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 301 NKASKGYEENYFFIFREGDGVYYNELETRVRLSKRRAKAGVQSGTNALLVVKHRDMNEKE 360 

Qy 361 LEAQEARKAQLENHEPEEEEEEEMETEEKEAGGSDEEQEKGSSSEKEGSEDEHSGSESER 420 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I : I 
Db 361 LEAQEARKAQLENHEPEEEEEEEMEAEEKEAGGSDEEQEKGSSSEKEGSEDEHSGSESDR 420 

Qy 421 EEGDRDEASDKSGSGEDESSEDEARAARDKEEIFGSDADSEDDADSDDEDRGQAQGGSDN 480 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 421 EEGDRDEASDKSGSGEDESSEDEARAARDKEEIFGSDADSEDDADSDDEDRGQAHRGSDN 480 

Qy 481 DSDSGSNGGGQR----SRSHSRSASPFPSGSEHSAQEDGSEAAASDSSEADSDSD 531 

I I I I I I : I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 481 DSDSGSDGGGQRSRSQSRSRSRSASPFPSGSEHSAQEDGSEAAASDSSEADSDSD 535 




RESULT 6 
Q9JJ99_MOUSE 







ID Q9JJ99_MOUSE PRELIMINARY; PRT; 535 AA. 

AC Q9JJ99; 

DT 01-OCT-2000 (TrEMBLrel. 15, Created) 

DT 01-OCT-2000 (TrEMBLrel. 15, Last sequence update) 

DT 01-JUN-2003 (TrEMBLrel. 24, Last annotation update) 

DE Mus musculus brain cDNA, clone MNCb-6444, similar to Homo sapiens cDNA 

DE FLJ11123, clone PLACE1006167. 

GN Name=5730511K23Rik; 

OS Mus musculus (Mouse). 

OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; 

OC Mammalia; Eutheria; Euarchontoglires; Glires; Rodentia; Sciurognathi; 

OC Muroidea; Muridae; Murinae; Mus. 

OX NCBI_TaxID=10090; 

RN [1] 

RP NUCLEOTIDE SEQUENCE. 

RC STRAIN=C57BL; 

RA Osada N., Kusuda J., Tanuma R., Ito A., Hirata M., Sugano S., 

RA Hashimoto K.; 

RL Submitted (APR-2000) to the EMBL/GenBank/DDBJ databases. 

DR EMBL; AB041615; BAA95098.1; -; mRNA. 

DR Ensembl; ENSMUSG00000003437; Mus musculus. 

DR MGI; MGI:1923988; 5730511K23Rik. 

DR InterPro; IPR007133; Paf1. 

DR Pfam; PF03985; Paf1; 1. 

SQ SEQUENCE 535 AA; 60534 MW; 6D7EEB1ECDC8C075 CRC64; 



Query Match 98.0%; Score 2708; DB 2; Length 535; 

Best Local Similarity 97.9%; Pred. No. 1.2e-119; 

Matches 524; Conservative 2; Mismatches 5; Indels 4; Gaps 1; 

Qy 1 MAPTIQTQAQREDGHRPNSHRTLPERSGVVCRVKYCNSLPDIPFDPKFITYPFDQNRFVQ 60 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I 
Db 1 MAPTIQTQAQREDGHRPNSHRTLPERSGVVCRVKYCNSLPDIPFDPKFITYPFDQNRFVQ 60 

Qy 61 YKATSLEKQHKHDLLTEPDLGVTIDLINPDTYRIDPNVLLDPADEKLLEEEIQAPTSSKR 120 

I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 61 YKATSLEKQHKHDLLTEPDLGVTIDLINPDTYRIDPNVLLDPADEKLLEEEIQAPTSSKR 120 

Qy 121 SQQHAKVVPWMRKTEYISTEFNRYGISNEKPEVKIGVSVKQQFTEEEIYKDRDSQITAIE 180 

I I I I I I I I I I I I II I I I I I I I I I I I I I I I II I I II I I I I I I I I I I I I I I I I I I I I I I I I I 



Db 121 SQQHAKVVPWMRKTEYISTEFNRYGISNEKPEVKIGVSVKQQFTEEEIYKDRDSQITAIE 180 



Qy 181 KTFEDAQKSISQHYSKPRVTPVEVMPVFPDFKMWINPCAQVIFDSDPAPKDTSGAAALEM 240 

I I I I I I II I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 181 KTFEDAQKSISQHYSKPRVTPVEVMPVFPDFKMWINPCAQVIFDSDLAPKDTSGAAALEM 240 

Qy 241 MSQAMIRGMMDEEGNQFVAYFLPVEETLKKRKRDQEEEMDYAPDDVYDYKIAREYNWNVK 300 

II I I I I I II I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 241 MSQAMIRGMMDEEGNQFVAYFLPVEETLKKRKRDQEEEMDYAPDDVYDYKIAREYNWNVK 300 

Qy 301 NKASKGYEENYFFIFREGDGVYYNELETRVRLSKRRAKAGVQSGTNALLVVKHRDMNEKE 360 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II II I I I I I I I I I I I I I I I I I I I I 
Db 301 NKASKGYEENYFFIFREGDGVYYNELETRVRLSKRRAKAGVQSGTNALLVVKHRDMNEKE 360 

Qy 361 LEAQEARKAQLENHEPEEEEEEEMETEEKEAGGSDEEQEKGSSSEKEGSEDEHSGSESER 420 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II 1 1 I I I I I I I I I I : I 
Db 361 LEAQEARKAQLENHEPEEEEEEEMEAEEKEAGGSDEEQEKGSSSEKEGSEDEHSGSESDR 420 

Qy 421 EEGDRDEASDKSGSGEDESSEDEARAARDKEEIFGSDADSEDDADSDDEDRGQAQGGSDN 480 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I 
Db 421 EEGDRDEASDKSGSGEDESSEDEARAARDKEEIFGSDADSEDDADSDDEDRGQAHRGSDN 480 

Qy 481 DSDSGSNGGGQR----SRSHSRSASPFPSGSEHSAQEDGSEAAASDSSEADSDSD 531 

I I I I I I : I II I I III I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 481 DSDSGSDGGGQRSRSQSRSRSRSASPFPSGSEHSAQEDGSEAAASDSSEADSDSD 535 



RESULT 7 
Q4V886_RAT 

ID Q4V886_RAT PRELIMINARY; PRT; 535 AA. 

AC Q4V886; 

DT 13-SEP-2005 (TrEMBLrel. 31, Created) 

DT 13-SEP-2005 (TrEMBLrel. 31, Last sequence update) 

DT 13-SEP-2005 (TrEMBLrel. 31, Last annotation update) 

DE Hypothetical protein RGD1306219_predicted. 

GN Name=RGD1306219_predicted; 

OS Rattus norvegicus (Rat). 

OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; 

OC Mammalia; Eutheria; Euarchontoglires; Glires; Rodentia; Sciurognathi; 

OC Muridae; Murinae; Rattus. 

OX NCBI_TaxID=10116; 

RN [1] 

RP NUCLEOTIDE SEQUENCE. 

RC TISSUE=Placenta; 

RX MEDLINE=22388257; PubMed=12477932; DOI=10.1073/pnas.242603899; 

RA Strausberg R.L., Feingold E.A., Grouse L.H., Derge J.G., 

RA Klausner R.D., Collins F.S., Wagner L., Shenmen C.M., Schuler G.D., 

RA Altschul S.F., Zeeberg B., Buetow K.H., Schaefer C.F., Bhat N.K., 

RA Hopkins R.F., Jordan H., Moore T., Max S.I., Wang J., Hsieh F., 

RA Diatchenko L., Marusina K., Farmer A.A., Rubin G.M., Hong L., 

RA Stapleton M., Soares M.B., Bonaldo M.F., Casavant T.L., Scheetz T.E., 

RA Brownstein M.J., Usdin T.B., Toshiyuki S., Carninci P., Prange C., 

RA Raha S.S., Loquellano N.A., Peters G.J., Abramson R.D., Mullahy S.J., 

RA Bosak S.A., McEwan P.J., McKernan K.J., Malek J.A., Gunaratne P.H., 

RA Richards S., Worley K.C., Hale S., Garcia A.M., Gay L.J., Hulyk S.W., 

RA Villalon D.K., Muzny D.M., Sodergren E.J., Lu X., Gibbs R.A., 

RA Fahey J., Helton E., Ketteman M., Madan A., Rodrigues S., Sanchez A., 

RA Whiting M., Madan A., Young A.C., Shevchenko Y., Bouffard G.G., 

RA Blakesley R.W., Touchman J.W., Green E.D., Dickson M.C., 

RA Rodriguez A.C., Grimwood J., Schmutz J., Myers R.M., 

RA Butterfield Y.S.N., Krzywinski M.I., Skalska U., Smailus D.E., 

RA Schnerch A., Schein J.E., Jones S.J.M., Marra M.A.; 

RT "Generation and initial analysis of more than 15,000 full-length human 

RT and mouse cDNA sequences."; 

RL Proc. Natl. Acad. Sci. U.S.A. 99:16899-16903(2002). 

RN [2] 

RP NUCLEOTIDE SEQUENCE. 

RC TISSUE=Placenta; 

RG NIH MGC Project; 

RL Submitted (JUN-2005) to the EMBL/GenBank/DDBJ databases. 

DR EMBL; BC097494; AAH97494.1; -; mRNA. 

KW Hypothetical protein. 

SQ SEQUENCE 535 AA; 60546 MW; 48432E1DA398806F CRC64; 

Query Match 97.9%; Score 2705; DB 2; Length 535; 

Best Local Similarity 97.8%; Pred. No. 1.6e-119; 

Matches 523; Conservative 2; Mismatches 6; Indels 4; Gaps 1; 

Qy 1 MAPTIQTQAQREDGHRPNSHRTLPERSGVVCRVKYCNSLPDIPFDPKFITYPFDQNRFVQ 60 

I I I I I I I I I I I I I I I I I ! I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 1 MAPTIQTQAQREDGHRPNSHRTLPERSGVVCRVKYCNSLPDIPFDPKFITYPFDQNRFVQ 60 

Qy 61 YKATSLEKQHKHDLLTEPDLGVTIDLINPDTYRIDPNVLLDPADEKLLEEEIQAPTSSKR 120 

I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 61 YKATSLEKQHKHDLLTEPDLGVTIDLINPDTYRIDPNVLLDPADEKLLEEEIQAPTSSKR 120 

Qy 121 SQQHAKVVPWMRKTEYISTEFNRYGISNEKPEVKIGVSVKQQFTEEEIYKDRDSQITAIE 180 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I 
Db 121 SQQHAKVVPWMRKTEYISTEFNRYGISNEKPEVKIGVSVKQQFTEEEIYKDRDSQITAIE 180 

Qy 181 KTFEDAQKSISQHYSKPRVTPVEVMPVFPDFKMWINPCAQVIFDSDPAPKDTSGAAALEM 240 

I I I I II II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 181 KTFEDAQKSISQHYSKPRVTPVEVMPVFPDFKMWINPCAQVIFDSDPAPKDTSGAAALEM 240 

Qy 241 MSQAMIRGMMDEEGNQFVAYFLPVEETLKKRKRDQEEEMDYAPDDVYDYKIAREYNWNVK 300 

I I I I I II I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 241 MSQAMIRGMMDEEGNQFVAYFLPVEETLKKRKRDQEEEMDYAPDDVYDYKIAREYNWNVK 300 

Qy 301 NKASKGYEENYFFIFREGDGVYYNELETRVRLSKRRAKAGVQSGTNALLVVKHRDMNEKE 360 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 301 NKASKGYEENYFFIFREGDGVYYNELETRVRLSKRRAKAGVQSGTNALLVVKHRDMNEKE 360 

Qy 361 LEAQEARKAQLENHEPEEEEEEEMETEEKEAGGSDEEQEKGSSSEKEGSEDEHSGSESER 420 

I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I II I II : I 
Db 361 LEAQEARKAQLENHEPEEEEEEEMEAEEKEAGGSDEEHEKGSSSEKEGSEDERSGSESDR 420 

Qy 421 EEGDRDEASDKSGSGEDESSEDEARAARDKEEIFGSDADSEDDADSDDEDRGQAQGGSDN 480 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II 
Db 421 EEGDRDEASDKSGSGEDESSEDEARAARDKEEIFGSDADSEDDADSDDEDRGQAHRGSDN 480 

Qy 481 DSDSGSNGGGQR----SRSHSRSASPFPSGSEHSAQEDGSEAAASDSSEADSDSD 531 

I I I I I I : I I I I I III I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 481 DSDSGSDGGGQRSRSQSRSRSRSASPFPSGSEHSAQEDGSEAAASDSSEADSDSD 535 



RESULT 8 
O75239_HUMAN 



ID O75239_HUMAN PRELIMINARY; PRT; 510 AA. 

AC O75239; 

DT 01-NOV-1998 (TrEMBLrel. 08, Created) 

DT 01-NOV-1998 (TrEMBLrel. 08, Last sequence update) 

DT 01-JUN-2003 (TrEMBLrel. 24, Last annotation update) 

DE F23149_1. 

OS Homo sapiens (Human). 

OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; 

OC Mammalia; Eutheria; Euarchontoglires; Primates; Catarrhini; Hominidae; 

OC Homo. 

OX NCBI_TaxID=9606; 

RN [1] 

RP NUCLEOTIDE SEQUENCE. 

RA Lamerdin J.E., McCready P.M., Skowronski E., Adamson A.W., 

RA Burkhart-Schultz K., Gordon L., Kyle A., Ramirez M., Stilwagen S., 

RA Phan H., Velasco N., Do L., Regala W., Terry A., Games J., 

RA Danganan L., Poundstone P., Christensen M., Georgescu A., Avila J., 

RA Liu S., Attix C., Andreise T., Trankheim M., Amico-Keller G., 

RA Coefield J., Duarte S., Lucas S., Bruce R., Thomas P., Quan G., 

RA Kronmiller B., Arellano A., Montgomery M., Ow D., Nolan M., Trong S., 

RA Kobayashi A., Olsen A.S., Carrano A.V.; 

RL Submitted (JUL-1998) to the EMBL/GenBank/DDBJ databases. 

DR EMBL; AC005239; AAC25503.1; -; Genomic_DNA. 

DR Ensembl; ENSG00000006712; Homo sapiens. 

DR InterPro; IPR007133; Paf1. 

DR Pfam; PF03985; Paf1; 1. 

SQ SEQUENCE 510 AA; 57466 MW; CACE73EDC7290CE8 CRC64; 



Query Match 91.6%; Score 2531.5; DB 2; Length 510; 

Best Local Similarity 91.7%; Pred. No. 2.2e-111; 

Matches 498; Conservative 0; Mismatches 0; Indels 45; Gaps 3; 

Qy 1 MAPTIQTQAQREDGHRPNSHRTLPERSGVVCRVKYCNSLPDIPFDPKFITYPFDQNRFVQ 60 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 1 MAPTIQTQAQREDGH----------RSGVVCRVKYCNSLPDIPFDPKFITYPFDQNRFVQ 50 

Qy 61 YKATSLEKQHKHDLLTEPDLGVTIDLINPDTYRIDPNVLLDPADEKLLEEEIQAPTSSKR 120 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 1 II I I I I 
Db 51 YKATSLEKQHKHDLLTEPDLGVTIDLINPDTYRIDPNVLLDPADEKLLEEEIQAPTSSKR 110 

Qy 121 SQQHAKVVPWMRKTEYISTEFNRYGISNEKPEVKIGVSVKQQFTEEEIYKDRDSQITAIE 180 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I' I I I II I I I I I I I I I I 
Db 111 SQQHAKVVPWMRKTEYISTEFNRYGISNEKPEVKIGVSVKQQFTEEEIYKDRDSQITAIE 170 

Qy 181 KTFEDAQKSISQHYSKPRVTPVEVMPVFPDFKMWINPCAQVIFDSDPAPKDTSGAAALEM 240 

I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I 

Db 171 KTFEDAQKS-----------------------MWINPCAQVIFDSDPAPKDTSGAAALEM 207 

Qy 241 MSQAMIRGMMDEEGNQFVAYFLPVEETLKKRKRDQEEEMDYAPDDVYDYKIAREYNWNVK 300 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 208 MSQAMIRGMMDEEGNQFVAYFLPVEETLKKRKRDQEEEMDYAPDDVYDYKIAREYNWNVK 267 

Qy 301 .NKASKGYEENYFFI FREGDGVYYNELETR VRL S KRRAKAGVQ S GTNAL 348 

I I I I I I I I I I I I I I I I I I I I I II I I I I I I I II II I I I I I I I I I I I I II 



Db 268 NKASKGYEENYFFIFREGDGVYYNELETRYSAHSYLISLDLVRLSKRRAKAGVQSGTNAL 327 



Qy 349 LVVKHRDMNEKELEAQEARKAQLENHEPEEEEEEEMETEEKEAGGSDEEQEKGSSSEKEG 408 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 328 LVVKHRDMNEKELEAQEARKAQLENHEPEEEEEEEMETEEKEAGGSDEEQEKGSSSEKEG 387 

Qy 409 SEDEHSGSESEREEGDRDEASDKSGSGEDESSEDEARAARDKEEIFGSDADSEDDADSDD 468 

I I I I I I I I I I I I I I I I I I I I I I I I U I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I 
Db 388 SEDEHSGSESEREEGDRDEASDKSGSGEDESSEDEARAARDKEEIFGSDADSEDDADSDD 447 

Qy 469 EDRGQAQGGSDNDSDSGSNGGGQRSRSHSRSASPFPSGSEHSAQEDGSEAAASDSSEADS 528 

I I I I I I I I I I I I I I I I I I I I I I 1 I I I I I I I I I I I 1 I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 448 EDRGQAQGGSDNDSDSGSNGGGQRSRSHSRSASPFPSGSEHSAQEDGSEAAASDSSEADS 507 

Qy 529 DSD 531 

III 
Db 508 DSD 510 



RESULT 9 
Q6P2Y1_XENTR 

ID Q6P2Y1_XENTR PRELIMINARY; PRT; 520 AA. 

AC Q6P2Y1; 

DT 05-JUL-2004 (TrEMBLrel. 27, Created) 

DT 05-JUL-2004 (TrEMBLrel. 27, Last sequence update) 

DT 05-JUL-2004 (TrEMBLrel. 27, Last annotation update) 

DE Hypothetical protein MGC76249. 

GN Name=MGC76249; 

OS Xenopus tropicalis (Western clawed frog) (Silurana tropicalis). 

OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; 

OC Amphibia; Batrachia; Anura; Mesobatrachia; Pipoidea; Pipidae; 

OC Xenopodinae; Xenopus; Silurana. 

OX NCBI_TaxID=8364; 

RN [1] 

RP NUCLEOTIDE SEQUENCE. 

RC TISSUE=Embryo; 

RX MEDLINE=22388257; PubMed=12477932; DOI=10.1073/pnas.242603899; 

RA Strausberg R.L., Feingold E.A., Grouse L.H., Derge J.G., 

RA Klausner R.D., Collins F.S., Wagner L., Shenmen C.M., Schuler G.D., 

RA Altschul S.F., Zeeberg B., Buetow K.H., Schaefer C.F., Bhat N.K., 

RA Hopkins R.F., Jordan H., Moore T., Max S.I., Wang J., Hsieh F., 

RA Diatchenko L., Marusina K., Farmer A.A., Rubin G.M., Hong L., 

RA Stapleton M., Soares M.B., Bonaldo M.F., Casavant T.L., Scheetz T.E., 

RA Brownstein M.J., Usdin T.B., Toshiyuki S., Carninci P., Prange C., 

RA Raha S.S., Loquellano N.A., Peters G.J., Abramson R.D., Mullahy S.J., 

RA Bosak S.A., McEwan P.J., McKernan K.J., Malek J.A., Gunaratne P.H., 

RA Richards S., Worley K.C., Hale S., Garcia A.M., Gay L.J., Hulyk S.W., 

RA Villalon D.K., Muzny D.M., Sodergren E.J., Lu X., Gibbs R.A., 

RA Fahey J., Helton E., Ketteman M., Madan A., Rodrigues S., Sanchez A., 

RA Whiting M., Madan A., Young A.C., Shevchenko Y., Bouffard G.G., 

RA Blakesley R.W., Touchman J.W., Green E.D., Dickson M.C., 

RA Rodriguez A.C., Grimwood J., Schmutz J., Myers R.M., 

RA Butterfield Y.S.N., Krzywinski M.I., Skalska U., Smailus D.E., 

RA Schnerch A., Schein J.E., Jones S.J.M., Marra M.A.; 

RT "Generation and initial analysis of more than 15,000 full-length human 

RT and mouse cDNA sequences."; 

RL Proc. Natl. Acad. Sci. U.S.A. 99:16899-16903(2002). 



RN [2] 

RP NUCLEOTIDE SEQUENCE. 

RC TISSUE=Embryo; 

RA Klein S., Gerhard D.S.; 

RL Submitted (DEC-2003) to the EMBL/GenBank/DDBJ databases. 

DR EMBL; BC064253; AAH64253.1; -; mRNA. 

DR InterPro; IPR007133; Paf1. 

DR Pfam; PF03985; Paf1; 1. 

KW Hypothetical protein. 

SQ SEQUENCE 520 AA; 59064 MW; 76D526C423C459A7 CRC64; 



Query Match 79.0%; Score 2183.5; DB 2; Length 520; 

Best Local Similarity 80.2%; Pred. No. 5.1e-95; 

Matches 429; Conservative 40; Mismatches 45; Indels 21; Gaps 8; 



Qy 1 MAPTIQTQAQREDGHRPNSHRTLPERSGVVCRVKYCNSLPDIPFDPKFITYPFDQNRFVQ 60 

I I I I I I I I I I I I I I I I : I I I I : I I I I I I II I I I I I I : I I I I I I I I I I I I I I I I I I I I I I 
Db 1 MAPTIQTQAQREDGHRSSSHRTVPERSGVVCRVKYCNTLPDIPFDPKFITYPFDQNRFVQ 60 

Qy 61 YKATSLEKQHKHDLLTEPDLGVTIDLINPDTYRIDPNVLLDPADEKLLEEEIQAPTSSKR 120 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I : I I I I 
Db 61 YKATSLEKQHKHDLLTEPDLGVTIDLINPDTYRIDPNVTLDIADEKLLEEEIQAPSSSKR 120 

Qy 121 SQQHAKVVPWMRKTEYISTEFNRYGISNEKPEVKIGVSVKQQFTEEEIYKDRDSQITAIE 180 

I I I I I II I I I I I I I I I I I I I I I I I I : I I I I I I I I I I I I I II I I I I I : I I I I I I I I I : I I I 
Db 121 SQQHAKVVPWMRKTEYISTEFNRYGVSNEKPEVKIGVSVKQQFTEEDIYKDRDSQISAIE 180 

Qy 181 KTFEDAQKSISQHYSKPRVTPVEVMPVFPDFKMWINPCAQVIFDSDPAPKDTSGAAALEM 240 

I I I I I II I I II I II I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I : I I I : I 
Db 181 KTFEDAQKPISQHYSKPRVTPVEVMPVFPDFKMWINPCAQVIFDSDPAPKDASGSAALDM 240 

Qy 241 MSQAMIRGMMDEEGNQFVAYFLPVEETLKKRKRDQEEEMDYAPDDVYDYKIAREYNWNVK 300 

I I I I I I I I I I I I I I I I I I I I I I I I I I :: I I I I I I I I : I I I : I : I I I I I I I I I I I I I I 
Db 241 MSQAMIRGMMDEEGNQFVAYFLPGEETMRKRKRDQEEGLDYMPEDIYDYKIAREYNWNVK 300 

Qy 301 NKASKGYEENYFFIFREGDGVYYNELETRVRLSKRRAKAGVQSGTNALLVVKHRDMNEKE 360 

I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I : I I I I I I I I : I I I 
Db 301 NKASKGYEENYFFIFREGDGVYYNELETRVKLSKRRVKAGVQSGTNAVLVVKHRDMHEKE 360 

Qy 361 LEAQEARKAQLENHEPEEEEEEEMETEEKEAGGSD-EEQEKGSSSEKEGSEDEHSGSESE 419 

I I I I I I I : I I I I I I I I I I I I I I : :: I III I : I I I I I I I I II I I I I I I I 
Db 361 LEAQEARRAQLENHEPEEEEEI EV DQETQGSDAEDGEKGSGSEKEGSGAEQSGSESE 417 

Qy 420 REEGDRDEASDKSGSGEDESSEDEARAARDKEEIFGSDADSEDDADSDDEDRGQAQGGSD 479 

I I I : : I I : I : : I : I I I I I I I I I I I I I I : I I : I I I : I I 
Db 418 REEAEEEEKEDE EEKES S EEDRAARDKEEI FGS D DDDSDED GPNESGQD 466 

Qy 480 NDSDSGSNG---GGQRSRSHSRSASPFPSGSEHSAQEDGSEAAASDSSEADSDSD 531 

: I I I I : II lllhlll I I : I I I I : :: I : I : : I I 
Db 467 GE-DSGSDDEEEKGQGRRSRSASSSPF--GSDHSQQENEDQSASDQGSGSSTGSD 518 



RESULT 10 
Q4U0S5_BRARE 

ID Q4U0S5_BRARE PRELIMINARY; PRT; 503 AA. 

AC Q4U0S5; 

DT 13-SEP-2005 (TrEMBLrel. 31, Created) 



DT 13-SEP-2005 (TrEMBLrel. 31, Last sequence update) 

DT 13-SEP-2005 (TrEMBLrel. 31, Last annotation update) 

DE PD2-like protein. 

OS Brachydanio rerio (Zebrafish) (Danio rerio). 

OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; 

OC Actinopterygii; Neopterygii; Teleostei; Ostariophysi; Cypriniformes; 

OC Cyprinidae; Danio. 

OX NCBI_TaxID=7955; 

RN [1] 

RP NUCLEOTIDE SEQUENCE. 

RA Amsterdam A., Hopkins N.; 

RT "Danio rerio PD2-like mRNA."; 

RL Submitted (APR-2005) to the EMBL/GenBank/DDBJ databases. 

DR EMBL; DQ022213; AAY44602.1; -; mRNA. 

SQ SEQUENCE 503 AA; 58216 MW; B8EE86A45B9D9DEE CRC64; 

Query Match 73.4%; Score 2030; DB 2; Length 503; 

Best Local Similarity 74.2%; Pred. No. 8.2e-88; 

Matches 402; Conservative 40; Mismatches 50; Indels 50; Gaps 8; 

Qy 1 MAPTIQTQAQREDGHRPNSHRTLPERSGWCRVKYCNSLPDIPFDPKFITYPFDQNRFVQ 60 
I I I I I I I I I I I I I I I I : : I I I : I I I I I I I I I I I I I I I I I I I I I I I I 1 I I I I I I : I I I I 
Db 1 MAPTIQTQAQREDGHRSSAHRTVPERSGWCRVKYGNSLPDIPFDPKFITYPFDQHREVQ 60 

Qy 61 YKATSLEKQHKHDLLTEPDLGVTIDLINPDTYRIDPNVLLDPADEKLLEEEIQAPTSSKR 120 
I I I I I I I I I I I I : I I II I I I I I I I I II I I I I I I I I I I : I I I I I I I I I I I I I I I I I : I I I I 
Db 61 YKATSLEKQHKHELLTEPDLGVTIDLINPDTYRIDPNILLDPADEKLLEEEIQAPSSSKR 120 

Qy 121 SQQHAKWPWMRKTEYISTEFNRYGISNEKPEVKIGVSVKQQFTEEEIYKDRDSQITAIE 180 
I I I I I I I I I I I I I I I I I I I I I I I I I : I I I I I I I I I I I I I I I I I I I I I I I I I I I I I III 
Db 121 SQQHAKWPWMRKTEYISTEFNRYGVSNEKV^VKIGVSVKQQFTEEEIYKDRDSQIAAIE 180 

Qy 181 KTFEDAQKSISQHYSKPRVTPVEVMPVFPDFKMWINPCAQVIFDSDPAPKDTSGAAALEM 240 
I I I I I I I II I I I I I I I I I I I I I I I : I I I I I I I I I I I I I I I I I I I I I I I I I I I I : : I 
Db 181 KTFEDAQKSISQHYSKPRVTPVEVLPVFPDFKMWINPCAQVIFDSDPAPKDVSAPAGVDM 240 

Qy 241 MSQAMIRGMMDEEGNQFVAYFLPVEETLKKRKRDQEEEMDYAPDDVYDYKIAREYNWNVK 300 
I I I I I I I I I I I I I I I I I I I I I I I I : I : : I I I I I I I I : I I I : : I I : I I I I I I I I I I I I 
Db 241 MSQAMIRGMMDEEGNQEVAYFLPNEDTMRKRKRDVT1EELDYMPEEVTEYKIAREYNWNVK 300 

Qy 301 NKASKGYEENYFFIFREGDGVYYNELETRVRLSKRRAKAGVQSGTNALLVVKHRDMNEKE 360 
I I I I I I I I I I I I I I I I : I I II II I I I I I I I I I I I I I I I II llhll I I I I I I I I I 
Db 301 NKASKGYEENYFFIFRDADGVYYNELETRVRLSKRRAKVGAQSSTNAVLVCKHRDMNEKE 360 

Qy 361 LEAQEARKAQLENHEPEEEEEE-EMETEEKEAGGSDEEQ EKGSSSEKEGSEDEHSGS 416 
I I I I I I I I I I I I I I I I I : I I I I :: I :: I I : I : : I I I I I : I 
Db 361 LEAQEARKAQLENHEPEDEEEELDLEKDMQEDSGEEREKPSDSENSESESEREEEERPAD 420 

Qy 417 ESEREEGD RDEASDKSGSGEDESSEDEARAARDKEEIFGSDADSEDDADSDDE 469 
I I I I I I : I I I I : I I I I I : I I I I II I I I I : : : : : I 
Db 421 EDEEEEEDEESVKRRRERKSSGSESGDD RQARDEEEIFGSDDDSEEEEEEEEE 473 

Qy 470 DRGQAQGGSDNDSDSGSNGGGQRSRSHSRSASPFPSGSEHSAQEDGSEAAASDSSEADSD 529 
11111:11 : I I I I llllhlll 
Db 474 GGARRRSN S S S V QHSASE RASDSSDA-SD 501 

Qy 530 SD 531 

Db 502 SD 503 
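
The percentages printed in each result header can be reproduced from the other numbers in the same header. The sketch below is inferred from the figures in this listing, not taken from GenCore documentation: it assumes "Query Match" is the score divided by the perfect score (2764 for this query), and that "Best Local Similarity" is the number of matches divided by the total number of alignment columns (matches + conservative + mismatches + indels). The function names are hypothetical.

```python
def query_match_pct(score: float, perfect_score: float) -> float:
    """Assumed definition: score as a percentage of the perfect (self) score."""
    return round(100.0 * score / perfect_score, 1)


def best_local_similarity_pct(matches: int, conservative: int,
                              mismatches: int, indels: int) -> float:
    """Assumed definition: identical columns over all alignment columns."""
    columns = matches + conservative + mismatches + indels
    return round(100.0 * matches / columns, 1)


if __name__ == "__main__":
    # RESULT 10 (Q4U0S5_BRARE): Score 2030, Matches 402, Conservative 40,
    # Mismatches 50, Indels 50, against a perfect score of 2764.
    print(query_match_pct(2030, 2764))
    print(best_local_similarity_pct(402, 40, 50, 50))
```

Under these assumptions the values agree with the headers of RESULT 10 through RESULT 15 to the printed precision.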

RESULT 11 
Q8N7H5_HUMAN 



ID Q8N7H5_HUMAN PRELIMINARY; PRT; 485 AA. 

AC Q8N7H5; 

DT 01-OCT-2002 (TrEMBLrel. 22, Created) 

DT 01-OCT-2002 (TrEMBLrel. 22, Last sequence update) 

DT 01-JUN-2003 (TrEMBLrel. 24, Last annotation update) 

DE Hypothetical protein FLJ25557. 

OS Homo sapiens (Human).

OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;

OC Mammalia; Eutheria; Euarchontoglires; Primates; Catarrhini; Hominidae;

OC Homo.

OX NCBI_TaxID=9606;

RN [1]

RP NUCLEOTIDE SEQUENCE.

RC TISSUE=Thyroid;

RA Ninomiya K., Wagatsuma M., Kanda K., Kondo H., Yokoi T., Kodaira H.,

RA Furuya T., Takahashi M., Kikkawa E., Omura Y., Abe K., Kamihara K.,

RA Katsuta N., Sato K., Tanikawa M., Yamazaki M., Suzuki Y., Hata H.,

RA Nakagawa K., Mizuno S., Morinaga M., Kawamura M., Sugiyama T.,

RA Irie R., Otsuki T., Sato H., Nishikawa T., Sugiyama A., Kawakami B.,

RA Nagai K., Isogai T., Sugano S.;

RL Submitted (JUL-2002) to the EMBL/GenBank/DDBJ databases.

DR EMBL; AK098423; BAC05305.1; -; mRNA.

DR Ensembl; ENSG00000006712; Homo sapiens.

DR InterPro; IPR007133; Paf1.

DR Pfam; PF03985; Paf1; 1.

SQ SEQUENCE 485 AA; 55501 MW; 5F4A1ACC99142C1D CRC64; 



Query Match 72.2%; Score 1995; DB 2; Length 485; 

Best Local Similarity 97.2%; Pred. No. 3.5e-86; 

Matches 383; Conservative 1; Mismatches 0; Indels 10; Gaps 1; 



Qy 1 MAPTIQTQAQREDGHRPNSHRTLPERSGWCRVKYCNSLPDIPFDPKFITYPFDQNRFVQ 60 

I I I I I I I I I i I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I 

Db 1 MAPTIQTQAQREDGH RSGWCRVKYCNSLPDIPFDPKFITYPFDQNRFVQ 50 

Qy 61 YKATSLEKQHKHDLLTEPDLGVTIDLINPDTYRIDPNVLLDPADEKLLEEEIQAPTSSKR 120 

I I I I I I I : I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I II I I I I I I I I I 

Db 51 YKATSLERQHKHDLLTEPDLGVTIDLINPDTYRIDPNVLLDPADEKLLEEEIQAPTSSKR 110 

Qy 121 SQQHAKWPWMRKTEYISTEFNRYGISNEKPEVKIGVSVKQQFTEEEIYKDRDSQITAIE 180 

I II I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I 

Db 111 SQQHAKWPWMRKTEYISTEFNRYGISNEKPEVKIGVSVKQQFTEEEIYKDRDSQITAIE 170 

Qy 181 KTFEDAQKSISQHYSKPRVTPVEVMPVFPDFKMWINPCAQVIFDSDPAPKDTSGAAALEM 240 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 171 KTFEDAQKSISQHYSKPRVTPVEVMPVFPDFKMWINPCAQVIFDSDPAPKDTSGAAALEM 230 

Qy 241 MSQAMIRGMMDEEGNQFVAYFLPVEETLKKRKRDQEEEMDYAPDDVYDYKIAREYNWNVK 300 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I 

Db 231 MSQAMIRGMMDEEGNQFVAYFLPVEETLKKRKRDQEEEMDYAPDDVYDYKIAREYNWNVK 290 



Qy 301 NKASKGYEENYFFIFREGDGVYYNELETRVRLSKRRAKAGVQSGTNALLWKHRDMNEKE 360 

I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 291 NKASKGYEENYFFIFREGDGVYYNELETRVRLSKRRAKAGVQSGTNALLWKHRDMNEKE 350 

Qy 361 LEAQEARKAQLENHEPEEEEEEEMETEEKEAGGS 394 

I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 351 LEAQEARKAQLENHEPEEEEEEEMETEEKEAGGS 384 
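
Each database entry in this listing is a run of lines that begin with a two-letter code (ID, AC, DT, DE, OS, OC, RA, DR, SQ and so on). A minimal sketch of grouping such lines by code is shown below. The function name is hypothetical, and the parsing rule (two uppercase letters, then whitespace, then the value) is an assumption based on how the entries appear here rather than on a formal format specification.

```python
from collections import defaultdict


def group_by_line_code(lines):
    """Collect the values of flat-file lines under their two-letter code."""
    fields = defaultdict(list)
    for line in lines:
        code, _, value = line.strip().partition(" ")
        # Keep only lines shaped like "XX value"; alignment rows labelled
        # "Qy"/"Db" and blank lines are skipped by this test.
        if len(code) == 2 and code.isupper() and value.strip():
            fields[code].append(value.strip())
    return dict(fields)


if __name__ == "__main__":
    entry = [
        "ID Q8N7H5_HUMAN PRELIMINARY; PRT; 485 AA.",
        "AC Q8N7H5;",
        "DE Hypothetical protein FLJ25557.",
        "OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;",
        "OC Homo.",
    ]
    for code, values in group_by_line_code(entry).items():
        print(code, values)
```

Multi-line fields such as OC or RA come back as a list with one item per line, which leaves the choice of how to join them to the caller.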



RESULT 12 
Q9CS63_MOUSE 

ID Q9CS63_MOUSE PRELIMINARY; PRT; 377 AA. 

AC Q9CS63; 

DT 01-JUN-2001 (TrEMBLrel. 17, Created) 

DT 01-JUN-2001 (TrEMBLrel. 17, Last sequence update) 

DT 01-JUN-2003 (TrEMBLrel. 24, Last annotation update) 

DE Mus musculus 8 days embryo whole body cDNA, RIKEN full-length enriched 

DE library, clone:5730511K23 product:PD2 PROTEIN (HYPOTHETICAL 60.0 kDa 

DE PROTEIN) homolog (Fragment).

GN Name=5730511K23Rik; 

OS Mus musculus (Mouse).

OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; 

OC Mammalia; Eutheria; Euarchontoglires; Glires; Rodentia; Sciurognathi;

OC Muroidea; Muridae; Murinae; Mus. 

OX NCBI_TaxID=10090; 

RN [1] 

RP NUCLEOTIDE SEQUENCE. 

RC STRAIN=C57BL/6J; TISSUE=Whole body; 

RX MEDLINE=99279253; PubMed=10349636; DOI=10.1016/S0076-6879(99)03004-9;

RA Carninci P., Hayashizaki Y.;

RT "High-efficiency full-length cDNA cloning.";

RL Meth. Enzymol. 303:19-44(1999). 

RN [2] 

RP NUCLEOTIDE SEQUENCE. 

RC STRAIN=C57BL/6J; TISSUE=Whole body; 

RX MEDLINE=21085660; PubMed=11217851; DOI=10.1038/35055500;

RA Kawai J., Shinagawa A., Shibata K., Yoshino M., Itoh M., Ishii Y.,

RA Arakawa T., Hara A., Fukunishi Y., Konno H., Adachi J., Fukuda S.,

RA Aizawa K., Izawa M., Nishi K., Kiyosawa H., Kondo S., Yamanaka I.,

RA Saito T., Okazaki Y., Gojobori T., Bono H., Kasukawa T., Saito R.,

RA Kadota K., Matsuda H.A., Ashburner M., Batalov S., Casavant T.,

RA Fleischmann W., Gaasterland T., Gissi C., King B., Kochiwa H.,

RA Kuehl P., Lewis S., Matsuo Y., Nikaido I., Pesole G., Quackenbush J.,

RA Schriml L.M., Staubli F., Suzuki R., Tomita M., Wagner L., Washio T.,

RA Sakai K., Okido T., Furuno M., Aono H., Baldarelli R., Barsh G.,

RA Blake J., Boffelli D., Bojunga N., Carninci P., de Bonaldo M.F.,

RA Brownstein M.J., Bult C., Fletcher C., Fujita M., Gariboldi M.,

RA Gustincich S., Hill D., Hofmann M., Hume D.A., Kamiya M., Lee N.H.,

RA Lyons P., Marchionni L., Mashima J., Mazzarelli J., Mombaerts P.,

RA Nordone P., Ring B., Ringwald M., Rodriguez I., Sakamoto N.,

RA Sasaki H., Sato K., Schoenbach C., Seya T., Shibata Y., Storch K.-F.,

RA Suzuki H., Toyo-oka K., Wang K.H., Weitz C., Whittaker C., Wilming L.,

RA Wynshaw-Boris A., Yoshida K., Hasegawa Y., Kawaji H., Kohtsuki S.,

RA Hayashizaki Y.;

RT "Functional annotation of a full-length mouse cDNA collection."; 

RL Nature 409:685-690(2001). 

RN [3] 



RP NUCLEOTIDE SEQUENCE. 

RC STRAIN=C57BL/6J; TISSUE=Whole body; 

RA The FANTOM Consortium, 

RA the RIKEN Genome Exploration Research Group Phase I & II Team; 

RT "Analysis of the mouse transcriptome based on functional annotation of 

RT 60,770 full-length cDNAs.";

RL Nature 420:563-573(2002).

RN [4] 

RP NUCLEOTIDE SEQUENCE. 

RC STRAIN=C57BL/6J; TISSUE=Whole body; 

RX MEDLINE=20499374; PubMed=11042159; DOI=10.1101/gr.145100;

RA Carninci P., Shibata Y., Hayatsu N., Sugahara Y., Shibata K., Itoh M.,

RA Konno H., Okazaki Y., Muramatsu M., Hayashizaki Y.;

RT "Normalization and subtraction of cap-trapper-selected cDNAs to 

RT prepare full-length cDNA libraries for rapid discovery of new genes." 

RL Genome Res. 10:1617-1630(2000). 

RN [5] 

RP NUCLEOTIDE SEQUENCE. 

RC STRAIN=C57BL/6J; TISSUE=Whole body; 

RX MEDLINE=20530913; PubMed=11076861; DOI=10.1101/gr.152600;

RA Shibata K., Itoh M., Aizawa K., Nagaoka S., Sasaki N., Carninci P.,

RA Konno H., Akiyama J., Nishi K., Kitsunai T., Tashiro H., Itoh M.,

RA Sumi N., Ishii Y., Nakamura S., Hazama M., Nishine T., Harada A.,

RA Yamamoto R., Matsumoto H., Sakaguchi S., Ikegami T., Kashiwagi K.,

RA Fujiwake S., Inoue K., Togawa Y., Izawa M., Ohara E., Watahiki M.,

RA Yoneda Y., Ishikawa T., Ozawa K., Tanaka T., Matsuura S., Kawai J.,

RA Okazaki Y., Muramatsu M., Inoue Y., Kira A., Hayashizaki Y.;

RT "RIKEN integrated sequence analysis (RISA) system-384-format 

RT sequencing pipeline with 384 multicapillary sequencer."; 

RL Genome Res. 10:1757-1771(2000). 

RN [6] 

RP NUCLEOTIDE SEQUENCE. 

RC STRAIN=C57BL/6J; TISSUE=Whole body; 

RA Adachi J., Aizawa K., Akahira S., Akimura T., Arai A., Aono H.,

RA Arakawa T., Bono H., Carninci P., Fukuda S., Fukunishi Y., Furuno M.,

RA Hanagaki T., Hara A., Hayatsu N., Hiramoto K., Hiraoka T., Hori F.,

RA Imotani K., Ishii Y., Itoh M., Izawa M., Kasukawa T., Kato H.,

RA Kawai J., Kojima Y., Konno H., Kouda M., Koya S., Kurihara C.,

RA Matsuyama T., Miyazaki A., Nishi K., Nomura K., Numazaki R., Ohno M.,

RA Okazaki Y., Okido T., Owa C., Saito H., Saito R., Sakai C., Sakai K.,

RA Sano H., Sasaki D., Shibata K., Shibata Y., Shinagawa A., Shiraki T.,

RA Sogabe Y., Suzuki H., Tagami M., Tagawa A., Takahashi F., Tanaka T.,

RA Tejima Y., Toya T., Yamamura T., Yasunishi A., Yoshida K., Yoshino M.,

RA Muramatsu M., Hayashizaki Y.;

RL Submitted (JUL-2000) to the EMBL/GenBank/DDBJ databases. 

DR EMBL; AK017762; BAB30913.1; -; mRNA. 

DR Ensembl; ENSMUSG00000003437; Mus musculus. 

DR MGI; MGI:1923988; 5730511K23Rik.

DR InterPro; IPR007133; Paf1.

DR Pfam; PF03985; Paf1; 1.

KW Hypothetical protein. 

FT NON_TER 377 377 

SQ SEQUENCE 377 AA; 43836 MW; 4ECE00D2D4EF5CEA CRC64; 



Query Match 71.8%; Score 1984; DB 2; Length 377; 

Best Local Similarity 100.0%; Pred. No. 8.7e-86; 

Matches 377; Conservative 0; Mismatches 0; Indels 0; Gaps 0;



Qy 1 MAPTIQTQAQREDGHRPNSHRTLPERSGWCRVKYCNSLPDIPFDPKFITYPFDQNRFVQ 60 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 1 MAPTIQTQAQREDGHRPNSHRTLPERSGWCRVKYCNSLPDIPFDPKFITYPFDQNRFVQ 60 



Qy 61 YKATSLEKQHKHDLLTEPDLGVTIDLINPDTYRIDPNVLLDPADEKLLEEEIQAPTSSKR 120 

I II I I I I I I I I I I I I I I I I I I I I II I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 61 YKATSLEKQHKHDLLTEPDLGVTIDLINPDTYRIDPNVLLDPADEKLLEEEIQAPTSSKR 120 

Qy 121 SQQHAKWPWMRKTEYISTEFNRYGISNEKPEVKIGVSVKQQFTEEEIYKDRDSQITAIE 180 

I I I I I I I I I I I I I I I I I I I I I I I I I II I II I I I I I I I I I I I II I I I I I I I I I I I I I I I I I 
Db 121 SQQHAKWPWMRKTEYISTEFNRYGISNEKPEVKIGVSVKQQFTEEEIYKDRDSQITAIE 180 

Qy 181 KTFEDAQKSISQHYSKPRVTPVEVMPVFPDFKMWINPCAQVIFDSDPAPKDTSGAAALEM 240 

I i I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I 
Db 181 KTFEDAQKSISQHYSKPRVTPVEVMPVFPDFKMWINPCAQVIFDSDPAPKDTSGAAALEM 240 

Qy 241 MSQAMIRGMMDEEGNQFVAYFLPVEETLKKRKRDQEEEMDYAPDDVYDYKIAREYNWNVK 300 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 241 MSQAMIRGMMDEEGNQFVAYFLPVEETLKKRKRDQEEEMDYAPDDVYDYKIAREYNWNVK 300 

Qy 301 NKASKGYEENYFFIFREGDGVYYNELETRVRLSKRRAKAGVQSGTNALLWKHRDMNEKE 360 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 301 NKASKGYEENYFFIFREGDGVYYNELETRVRLSKRRAKAGVQSGTNALLWKHRDMNEKE 360 

Qy 361 LEAQEARKAQLENHEPE 377 

I I I I I I I I I I I I I I I I I 
Db 361 LEAQEARKAQLENHEPE 377 
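
Each alignment row carries a start coordinate, a residue string and an end coordinate, so the row can be checked for internal consistency: the number of residues (ignoring gap characters) should equal end - start + 1. The sketch below shows that check. The function name is hypothetical, and treating every non-letter character as a gap or stray mark is an assumption made for this scanned listing.

```python
def row_is_consistent(start: int, residues: str, end: int) -> bool:
    """True when the residue count matches the printed coordinates."""
    # Count letters only, so '-' gap symbols and stray spaces are ignored.
    ungapped = sum(1 for ch in residues if ch.isalpha())
    return end - start + 1 == ungapped


if __name__ == "__main__":
    # Last row of RESULT 12: "Qy 361 LEAQEARKAQLENHEPE 377"
    print(row_is_consistent(361, "LEAQEARKAQLENHEPE", 377))
```

A row that fails this check in a scanned report is a candidate for a character-recognition error, for example two adjacent letters merged into one.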



RESULT 13 
Q68F51_XENLA 

ID Q68F51_XENLA PRELIMINARY; PRT; 407 AA. 

AC Q68F51; 

DT 25-OCT-2004 (TrEMBLrel. 28, Created) 

DT 25-OCT-2004 (TrEMBLrel. 28, Last sequence update) 

DT 25-OCT-2004 (TrEMBLrel. 28, Last annotation update) 

DE LOC446278 protein (Fragment). 

GN Name=LOC446278; 

OS Xenopus laevis (African clawed frog).

OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;

OC Amphibia; Batrachia; Anura; Mesobatrachia; Pipoidea; Pipidae; 

OC Xenopodinae; Xenopus; Xenopus. 

OX NCBI_TaxID=8355; 

RN [1] 

RP NUCLEOTIDE SEQUENCE. 

RC TISSUE=Embryo; 

RX MEDLINE=22341132; PubMed=12454917; DOI=10.1002/dvdy.10174;

RA Klein S.L., Strausberg R.L., Wagner L., Pontius J., Clifton S.W., 

RA Richardson P.; 

RT "Genetic and genomic tools for Xenopus research: The NIH Xenopus 

RT initiative."; 

RL Dev. Dyn. 225:384-391(2002). 

RN [2] 

RP NUCLEOTIDE SEQUENCE. 

RC TISSUE=Embryo; 

RX MEDLINE=22388257; PubMed=12477932; DOI=10.1073/pnas.242603899;

RA Strausberg R.L., Feingold E.A., Grouse L.H., Derge J.G.,

RA Klausner R.D., Collins F.S., Wagner L., Shenmen C.M., Schuler G.D.,

RA Altschul S.F., Zeeberg B., Buetow K.H., Schaefer C.F., Bhat N.K.,

RA Hopkins R.F., Jordan H., Moore T., Max S.I., Wang J., Hsieh F.,

RA Diatchenko L., Marusina K., Farmer A.A., Rubin G.M., Hong L.,

RA Stapleton M., Soares M.B., Bonaldo M.F., Casavant T.L., Scheetz T.E.,

RA Brownstein M.J., Usdin T.B., Toshiyuki S., Carninci P., Prange C.,

RA Raha S.S., Loquellano N.A., Peters G.J., Abramson R.D., Mullahy S.J.,

RA Bosak S.A., McEwan P.J., McKernan K.J., Malek J.A., Gunaratne P.H.,

RA Richards S., Worley K.C., Hale S., Garcia A.M., Gay L.J., Hulyk S.W.,

RA Villalon D.K., Muzny D.M., Sodergren E.J., Lu X., Gibbs R.A.,

RA Fahey J., Helton E., Ketteman M., Madan A., Rodrigues S., Sanchez A.,

RA Whiting M., Madan A., Young A.C., Shevchenko Y., Bouffard G.G.,

RA Blakesley R.W., Touchman J.W., Green E.D., Dickson M.C.,

RA Rodriguez A.C., Grimwood J., Schmutz J., Myers R.M.,

RA Butterfield Y.S.N., Krzywinski M.I., Skalska U., Smailus D.E.,

RA Schnerch A., Schein J.E., Jones S.J.M., Marra M.A.;

RT "Generation and initial analysis of more than 15,000 full-length human 

RT and mouse cDNA sequences."; 

RL Proc. Natl. Acad. Sci. U.S.A. 99:16899-16903(2002). 

RN [3] 

RP NUCLEOTIDE SEQUENCE. 

RC TISSUE=Embryo; 

RA Klein S., Gerhard D.S.; 

RL Submitted (AUG-2004) to the EMBL/GenBank/DDBJ databases. 

DR EMBL; BC079993; AAH79993.1; -; mRNA. 

DR InterPro; IPR007133; Paf1.

DR Pfam; PF03985; Paf1; 1.

FT NON_TER 407 407 

SQ SEQUENCE 407 AA; 47154 MW; 6CE32A7307186F83 CRC64; 



Query Match 70.0%; Score 1935; DB 2; Length 407; 

Best Local Similarity 90.0%; Pred. No. 1.9e-83; 

Matches 367; Conservative 22; Mismatches 15; Indels 4; Gaps 2; 

Qy 1 MAPTIQTQAQREDGHRPNSHRTLPERSGWCRVKYCNSLPDIPFDPKFITYPFDQNRFVQ 60 

I I I I I I I II I I I I I I I : I I I I : I I I I I I I I I I I I I I : I I I I I I I I I I I I I I I I I I I I I I 
Db 1 MAPTIQTQAQREDGHRSSSHRTVPERSGWCRVKYCNTLPDIPFDPKFITYPFDQNRFVQ 60 

Qy 61 YKATSLEKQHKHDLLTEPDLGVTIDLINPDTYRIDPNVLLDPADEKLLEEEIQAPTSSKR 120 

I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I II I I I I I I I I I I I I I : I I I I 
Db 61 YKATSLEKQHKHDLLTEPDLGVTIDLINPDTYRIDPNVTLDFADEKLLEEEIQAPSSSKR 120 

Qy 121 SQQHAKWPWMRKTEYISTEFNRYGISNEKPEVKIGVSVKQQFTEEEIYKDRDSQITAIE 180 

I I I I I I I I I I I I I I I I I I I I I I I I I : I I I I I I I I I I I I I I I I I I I I : I I I I I I I I I : I I I 

Db 121 SQQHAKWPWMRKTEYI STEFNRYGVSNEKPEVKIGVSVKQQFTEEDIYKDRDSQISAIE 180 

Qy 181 KTFEDAQKSISQHYSKPRVTPVEVMPVFPDFKMWINPCAQVIFDSDPAPKDTSGAAALEM 240 

111:1111 I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I : I 
Db 181 KTFDDAQKDISQHYSKPRWPVFIVMPVFPDFKMWINPCAQvTFDSDPAPKDASGTAALDM 240 

Qy 241 MSQAMIRGMMDEEGNQFVAYFLPVEETLKKRKRDQEEEMDYAPDDVYDYKIAREYNWNVK 300 

II I I I I I I I I I I I I I II I I I I I I I : I I I I I I I I I : I I I : I : I I I I I II I I I I I I I 

Db 241 MSQAMIRGMMDEEGNQFVAYFLPGEDTMRKRKRDQEEGLDYMPEDIYDYKIAREYNWNVK 300 

Qy 301 NKASKGYEENYFFIFREGDGVYYNELETRVRLSKRRAKAGVQSGTNALLVVKHRDMNEKE 360 

I I I I I I II I I I I I I I I I I I I I I I I II I I I I I II I I I I I I I I I I I I I I I I I I I I I I : I I I 



Db 301 NKASKGYEENYFFIFREGDGVYYNELETRVRLSKRRVKAGVQSGTNALLWKHRDIVIHEKE 360 

Qy 361 LEAQEARKAQLENHEPEEEEEEEMETEEKEAGGSD-EEQEKGSSSEKE 407 

I I I I I I I : I I I I I I I I I I I I I I : ::: III I! MM III: 
Db 361 LEAQEARRAQLENHEPEEEEEIEV DRDTQGSDAEEGEKGSGSEKK 405 



RESULT 14 
Q4RRR2_TETNG 

ID Q4RRR2_TETNG PRELIMINARY; PRT; 370 AA. 

AC Q4RRR2; 

DT 13-SEP-2005 (TrEMBLrel. 31, Created) 

DT 13-SEP-2005 (TrEMBLrel. 31, Last sequence update) 

DT 13-SEP-2005 (TrEMBLrel. 31, Last annotation update) 

DE Chromosome 16 SCAF15002, whole genome shotgun sequence. 

GN ORFNames=GSTENG00030053001; 

OS Tetraodon nigroviridis (Green puffer).

OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
OC Actinopterygii; Neopterygii; Teleostei; Euteleostei; Neoteleostei; 
OC Acanthomorpha; Acanthopterygii; Percomorpha; Tetraodontiformes;
OC Tetradontoidea; Tetraodontidae; Tetraodon. 
OX NCBI_TaxID=99883; 
RN [1]
RP NUCLEOTIDE SEQUENCE. 

RA Jaillon O., Aury J.M., Brunet F., Petit J.L., Stange-Thomann N.,

RA Mauceli E., Bouneau L., Fischer C., Ozouf-Costaz C., Bernot A.,

RA Nicaud S., Jaffe D., Fisher S., Lutfalla G., Dossat C., Segurens B.,

RA Dasilva C., Salanoubat M., Levy M., Boudet N., Castellano S.,

RA Anthouard V., Jubin C., Castelli V., Katinka M., Vacherie B.,

RA Biemont C., Skalli Z., Cattolico L., Poulain J., De Berardinis V.,

RA Cruaud C., Duprat S., Brottier P., Coutanceau J.P., Gouzy J.,

RA Parra G., Lardier G., Chapple C., McKernan K.J., McEwan P., Bosak S.,

RA Kellis M., Volff J.N., Guigo R., Zody M.C., Mesirov J.,

RA Lindblad-Toh K., Birren B., Nusbaum C., Kahn D., Robinson-Rechavi M.,

RA Laudet V., Schachter V., Quetier F., Saurin W., Scarpelli C.,

RA Wincker P., Lander E.S., Weissenbach J., Roest Crollius H.;

RT "Genome duplication in the teleost fish Tetraodon nigroviridis reveals 

RT the early vertebrate proto-karyotype.";

RL Nature 431:946-957(2004).

RN [2] 

RP NUCLEOTIDE SEQUENCE. 

RG Genoscope; Whitehead Institute Centre for Genome Research; 

RL Submitted (FEB-2004) to the EMBL/GenBank/DDBJ databases. 

CC -!- CAUTION: The sequence shown here is derived from an 

CC EMBL/GenBank/DDBJ whole genome shotgun (WGS) entry which is 

CC preliminary data. 

DR EMBL; CAAE01015002; CAG08920.1; -; Genomic_DNA. 

SQ SEQUENCE 370 AA; 42817 MW; 22D1431AB0128D36 CRC64; 



Query Match 63.6%; Score 1757.5; DB 2; Length 370; 

Best Local Similarity 90.7%; Pred. No. 3.8e-75; 

Matches 331; Conservative 20; Mismatches 13; Indels 1; Gaps 1; 



Qy 1 MAPTIQTQAQREDGH-RPNSHRTLPERSGWCRVKYCNSLPDIPFDPKFITYPFDQNRFV 59 

M M M M I M I : M I I : I I Hllllllllllllllllllllllllllllllhlll 
Db 1 MAPTIQTQAQREEGHSRPAAHRAIPERSGWCRVKYCNSLPDIPFDPKFITYPFDQHRFV 60 



Qy 60 QYKATSLEKQHKHDLLTEPDLGVTIDLINPDTYRIDPNVLLDPADEKLLEEEIQAPTSSK 119 

J I I I I I I I I I I I I I I I : I I I I I I I I I I I I I I I I I I I I : I I I I I I I I I I I I I : I I I I : I I I 
Db 61 QYKATSLEKQHKHDLLSEPDLGVTIDLINPDTYRIDPSVLLDPADEKLLEEDIQAPSSSK 120 



Qy 120 RSQQHAKWPWMRKTEYISTEFNRYGISNEKPEVKIGVSVKQQFTEEEIYKDRDSQITAI 179 

I I I I I I I I I i I I I I I I I I I I I I I I I I : I I I I I I I I I I I I I I I I I I I I I I I I I I I I I : I I 
Db 121 RSQQHAKWPWMRKTEYISTEFNRYGVSNEKVEVKIGVSVKQQFTEEEIYKDRDSQISAI 180 

Qy 180 EKTFEDAQKSISQHYSKPRVTPVEVMPVFPDFKMWINPCAQVIFDSDPAPKDTSGAAALE 239 

I I I I I I I I I I I : I I I I II I I I I I I I : I I I I I I I I I I I I I I I I I I I I I I I I I I II I : I 
Db 181 EKTFEDAQKSITQHYSKPRVTPVEVLPVFPDFKMWINPCAQVIFDSDPAPKDISGPAGVE 240 

Qy 240 MMSQAMIRGMMDEEGNQFVAYFLPVEETLKKRKRDQEEEMDYAPDDVYDYKIAREYNWNV 299 

I I I I I II I I I I I I I I I I I I I I I I I hllHIIII II : I I I : I : I I I I I I I I I I I I I 
Db 241 MMSQAMIRGMMDEEGNQFVAYFLPNEDTLRKRKRDFEEGVTDYMPEDLYDYKIAREYNWNV 300 

Qy 300 KNKASKGYEENYFFIFREGDGVYYNELETRVRLSKRRAKAGVQSGTNALLVVKHRDMNEK 359 

I I I I II I I I I I I I II I I : I I I I I I I I I I I I I I I I I I I I I I I II 111:11 I I I I I I I I 
Db 301 KNKASKGYEENYFFIFRDGDGVYYNELETRVRLSKRRAKAGTQSTTNAVLVCKHRDMNEK 360 

Qy 360 ELEAQ 364 

I I I I I 

Db 361 ELEAQ 365 



RESULT 15 
Q9VN55_DROME 

ID Q9VN55_DROME PRELIMINARY; PRT; 538 AA. 

AC Q9VN55; 

DT 01-MAY-2000 (TrEMBLrel. 13, Created) 

DT 01-MAY-2000 (TrEMBLrel. 13, Last sequence update) 

DT 10-MAY-2005 (TrEMBLrel. 30, Last annotation update) 

DE CG2503-PA (LD37523p).

GN Name=atms; ORFNames=CG2503;

OS Drosophila melanogaster (Fruit fly).

OC Eukaryota; Metazoa; Arthropoda; Hexapoda; Insecta; Pterygota; 

OC Neoptera; Endopterygota; Diptera; Brachycera; Muscomorpha; 

OC Ephydroidea; Drosophilidae; Drosophila. 

OX NCBI_TaxID=7227; 

RN [1] 

RP NUCLEOTIDE SEQUENCE. 

RX MEDLINE=20196006; PubMed=10731132; DOI=10.1126/science.287.5461.2185;

RA Adams M.D., Celniker S.E., Holt R.A., Evans C.A., Gocayne J.D.,

RA Amanatides P.G., Scherer S.E., Li P.W., Hoskins R.A., Galle R.F.,

RA George R.A., Lewis S.E., Richards S., Ashburner M., Henderson S.N.,

RA Sutton G.G., Wortman J.R., Yandell M.D., Zhang Q., Chen L.X.,

RA Brandon R.C., Rogers Y.-H.C., Blazej R.G., Champe M., Pfeiffer B.D.,

RA Wan K.H., Doyle C., Baxter E.G., Helt G., Nelson C.R., Miklos G.L.G.,

RA Abril J.F., Agbayani A., An H.-J., Andrews-Pfannkoch C., Baldwin D.,

RA Ballew R.M., Basu A., Baxendale J., Bayraktaroglu L., Beasley E.M.,

RA Beeson K.Y., Benos P.V., Berman B.P., Bhandari D., Bolshakov S.,

RA Borkova D., Botchan M.R., Bouck J., Brokstein P., Brottier P.,

RA Burtis K.C., Busam D.A., Butler H., Cadieu E., Center A., Chandra I.,

RA Cherry J.M., Cawley S., Dahlke C., Davenport L.B., Davies P.,

RA de Pablos B., Delcher A., Deng Z., Mays A.D., Dew I., Dietz S.M.,

RA Dodson K., Doup L.E., Downes M., Dugan-Rocha S., Dunkov B.C., Dunn P.,

RA Durbin K.J., Evangelista C.C., Ferraz C., Ferriera S., Fleischmann W.,

RA Fosler C., Gabrielian A.E., Garg N.S., Gelbart W.M., Glasser K.,

RA Glodek A., Gong F., Gorrell J.H., Gu Z., Guan P., Harris M.,

RA Harris N.L., Harvey D.A., Heiman T.J., Hernandez J.R., Houck J.,

RA Hostin D., Houston K.A., Howland T.J., Wei M.-H., Ibegwam C.,

RA Jalali M., Kalush F., Karpen G.H., Ke Z., Kennison J.A., Ketchum K.A.,

RA Kimmel B.E., Kodira C.D., Kraft C., Kravitz S., Kulp D., Lai Z.,

RA Lasko P., Lei Y., Levitsky A.A., Li J.H., Li Z., Liang Y., Lin X.,

RA Liu X., Mattei B., McIntosh T.C., McLeod M.P., McPherson D.,

RA Merkulov G., Milshina N.V., Mobarry C., Morris J., Moshrefi A.,

RA Mount S.M., Moy M., Murphy B., Murphy L., Muzny D.M., Nelson D.L.,

RA Nelson D.R., Nelson K.A., Nixon K., Nusskern D.R., Pacleb J.M.,

RA Palazzolo M., Pittman G.S., Pan S., Pollard J., Puri V., Reese M.G.,

RA Reinert K., Remington K., Saunders R.D.C., Scheeler F., Shen H.,

RA Shue B.C., Siden-Kiamos I., Simpson M., Skupski M.P., Smith T.,

RA Spier E., Spradling A.C., Stapleton M., Strong R., Sun E.,

RA Svirskas R., Tector C., Turner R., Venter E., Wang A.H., Wang X.,

RA Wang Z.-Y., Wassarman D.A., Weinstock G.M., Weissenbach J.,

RA Williams S.M., Woodage T., Worley K.C., Wu D., Yang S., Yao Q.A.,

RA Ye J., Yeh R.-F., Zaveri J.S., Zhan M., Zhang G., Zhao Q., Zheng L.,

RA Zheng X.H., Zhong F.N., Zhong W., Zhou X., Zhu S., Zhu X., Smith H.O.,

RA Gibbs R.A., Myers E.W., Rubin G.M., Venter J.C.;

RT "The genome sequence of Drosophila melanogaster.";

RL Science 287:2185-2195(2000).

RN [2] 

RP NUCLEOTIDE SEQUENCE. 

RX MEDLINE=22426065; PubMed=12537568;

RA Celniker S.E., Wheeler D.A., Kronmiller B., Carlson J.W., Halpern A.,

RA Patel S., Adams M., Champe M., Dugan S.P., Frise E., Hodgson A.,

RA George R.A., Hoskins R.A., Laverty T., Muzny D.M., Nelson C.R.,

RA Pacleb J.M., Park S., Pfeiffer B.D., Richards S., Sodergren E.J.,

RA Svirskas R., Tabor P.E., Wan K., Stapleton M., Sutton G.G., Venter C.,

RA Weinstock G., Scherer S.E., Myers E.W., Gibbs R.A., Rubin G.M.;

RT "Finishing a whole-genome shotgun: release 3 of the Drosophila 

RT melanogaster euchromatic genome sequence."; 

RL Genome Biol. 3:RESEARCH0079-RESEARCH0079(2002).

RN [3] 

RP NUCLEOTIDE SEQUENCE. 

RX MEDLINE=22426070; PubMed=12537573; 

RA Kaminker J.S., Bergman C.M., Kronmiller B., Carlson J.W., Svirskas R.,

RA Patel S., Frise E., Wheeler D.A., Lewis S.E., Rubin G.M., 

RA Ashburner M., Celniker S.E.; 

RT "The transposable elements of the Drosophila melanogaster euchromatin: 

RT a genomics perspective."; 

RL Genome Biol. 3:RESEARCH0084.1-RESEARCH0084.20(2002).

RN [4] 

RP NUCLEOTIDE SEQUENCE. 

RX MEDLINE=22426069; PubMed=12537572;

RA Misra S., Crosby M.A., Mungall C.J., Matthews B.B., Campbell K.S.,

RA Hradecky P., Huang Y., Kaminker J.S., Millburn G.H., Prochnik S.E.,

RA Smith C.D., Tupy J.L., Whitfield E.J., Bayraktaroglu L., Berman B.P.,

RA Bettencourt B.R., Celniker S.E., de Grey A.D.N.J., Drysdale R.A.,

RA Harris N.L., Richter J., Russo S., Schroeder A.J., Shu S.Q.,

RA Stapleton M., Yamada C., Ashburner M., Gelbart W.M., Rubin G.M.,

RA Lewis S.E.; 

RT "Annotation of the Drosophila melanogaster euchromatic genome: a 

RT systematic review."; 

RL Genome Biol. 3:RESEARCH0083.1-RESEARCH0083.22(2002).



RN [5] 

RP NUCLEOTIDE SEQUENCE. 

RG Berkeley Drosophila Genome Project; 

RA Celniker S., Carlson J., Wan K., Pfeiffer B., Frise E., George R.,

RA Hoskins R., Stapleton M., Pacleb J., Park S., Svirskas R., Smith E.,

RA Yu C., Rubin G.;

RT "Drosophila melanogaster release 4 sequence."; 

RL Submitted (MAR-2000) to the EMBL/GenBank/DDBJ databases. 

RN [6] 

RP NUCLEOTIDE SEQUENCE. 

RG FlyBase; 

RL Submitted (MAR-2005) to the EMBL/GenBank/DDBJ databases. 

RN [7] 

RP NUCLEOTIDE SEQUENCE. 

RC STRAIN=Berkeley; 

RA Stapleton M., Brokstein P., Hong L., Agbayani A., Carlson J.,

RA Champe M., Chavez C., Dorsett V., Dresnek D., Farfan D., Frise E.,

RA George R., Gonzalez M., Guarin H., Kronmiller B., Li P., Liao G.,

RA Miranda A., Mungall C.J., Nunoo J., Pacleb J., Paragas V., Park S.,

RA Patel S., Phouanenavong S., Wan K., Yu C., Lewis S.E., Rubin G.M.,

RA Celniker S.;

RL Submitted (DEC-2001) to the EMBL/GenBank/DDBJ databases. 

DR EMBL; AE003605; AAF52095.1; -; Genomic_DNA. 

DR EMBL; AY070561; AAL48032.1; -; mRNA. 

DR Ensembl; CG2503; Drosophila melanogaster. 

DR FlyBase; FBgn0010750; atms.

DR FlyBase; FBgn0010750; CG2503. 

DR InterPro; IPR007133; Paf1.

DR Pfam; PF03985; Paf1; 1.

SQ SEQUENCE 538 AA; 60794 MW; D55E95B4F4EB8E51 CRC64; 



Query Match 45.0%; Score 1244.5; DB 2; Length 538; 

Best Local Similarity 50.0%; Pred. No. 7.5e-51; 

Matches 271; Conservative 66; Mismatches 172; Indels 33; Gaps 11; 

Qy 1 MAPTIQTQAQREDGHRPNSHRTLPERSGWCRVKYCNSLPDIPFDPKFITYPFDQNRFVQ 60 

I I I I I : I : I :: I I I I I I : I I I I I I I II : I I I I : I I I I 

Db 1 MPPTINNSAVNSAAEK-RPQRQTERKSEIICRVKYGNNLPDIPFDLKFLQYPFDSHRFVQ 59 

Qy 61 YKATSLEKQHKHDLLTEPDLGVTIDLINPDTYRIDPNVLLDPADEKLLEEEIQAPTSSKR 120 

I MM: I : I : I I I I I I I I : I I I I : I : I I I I I II I I I I I I I II I I 
Db 60 YNPTSLERNFKYDVLTEHDLGVTVDLINRELYQADSMTLLDPADEKLLEEETLTPTDSVR 119 

Qy 121 SQQHAKWPWMRKTEYISTEFNRYGISN-EKPEVKIGVSVKQQFTEEEIYKDRDSQITAI 179 

I : I I : : I I : I I : I I I I I I I : II I I : I : I I : I I : I I I : : I I II 
Db 120 SRQHSRTVSWLRKSEYISTEQTRFQPQNLENIEAKVGYNVKKSLREETLYLDREAQIKAI 179 

Qy 180 EKTFEDAQKSISQHYSKPRVTPVEVMPVFPDFKMWINPCAQVIFDSDPAPKDTSGAAALE 239 

Mill: I : : I II I I I II I I : I : I II I I I II II I II I I I II : III 
Db 180 EKTFSDTKSEITKHYSKPNWPvlTvXPIFPDFTNWKFPCAQVIFDSDPAPAGKNVPAQLE 239 

Qy 240 MMSQAMIRGMMDEEGNQFVAYFLPVEETLKKRKRDQEEEMDYAPDDVYDYKIAREYNWNV 299 

II II I I II : II I I II II II II I : II : II : I I : : I : I I I I I II I I I I 

Db 240 EMSQAMIRGVMDESGEQFVAYFLPTEQTLEKRRTDFINGELYKEEEEYEYKIAREYNWNV 299 

Qy 300 KNKASKGYEENYFFIFREGDGVYYNELETRVRLSKRRAKAGVQSGTNALLVVKHRDMNEK 359 

I I II I II II I I II : I : II : I II II I II I M : I II I I I I I M II I :: 



Db 300 KTKASKGYEENYFFVMRQ-DGIYYNELETRVRLNKRRVKVG-QQPNNTKLWKHRPLDSM 357 

Qy 360 ELEAQEARKAQLENHEPEEE EEEEM ETEE KEAGGSD 395 

I Ihlll III I I I : I III: : I I : I 

Db 358 EH RMQ R YRE RQL EVP GE E EE I VE EVREE EQMQ 1 1 GET EKT S E DAAVGAQAAS GAD S P AQV 417 

Qy 396 --EEQEKGSSSEKEGSEDEHSGSESEREEGDRDEASDKSGSGEDESSEDEARAARDKEEI 453 

: I : I : I I I I I I I :: : I I I I : I : : 

Db 418 ARDRQSRSRSRTRSGS-SSGSGSGSGSRASSRSKSGSRSGSGSRSRTNSPAGSQKSGSR- 475 

Qy 454 FGSDADSEDDADSDDEDRGQAQGGSDNDSDSGS-NGGGQRSRSHSRSASPFPSGSEHSAQ 512 

I : I : I I ::: I : I I I I : I I I I I I I I I I III : 
Db 476 SRSVSRSRSRSKSGSRSRSRSRSKSGSRSRSGSRSGSGSRSPSRSRSGSPSGSGSSSGSA 535 

Qy 513 ED 514 

I 

Db 536 SD 537 



Search completed: April 25, 2006, 09:11:07 
Job time : 236 secs 



GenCore version 5.1.7 
Copyright (c) 1993 - 2006 Biocceleration Ltd. 



OM protein - protein search, using sw model 

Run on: April 25, 2006, 09:11:27 ; Search time 46 Seconds 

(without alignments) 
954.365 Million cell updates/sec 

Title: US-10-721-553-2 
Perfect score: 2764 

Sequence: 1 MAPTIQTQAQREDGHRPNSH QEDGSEAAASDSSEADSDSD 531 

Scoring table: BLOSUM62 

Gapop 10.0 , Gapext 0.5 

Searched: 572060 seqs, 82675679 residues 

Total number of hits satisfying chosen parameters: 572060 

Minimum DB seq length: 0 

Maximum DB seq length: 2000000000 

Post-processing: Minimum Match 0% 

Maximum Match 100% 
Listing first 45 summaries 

Database : Issued_Patents_AA:* 

1: /cgn2_6/ptodata/1/iaa/5_COMB.pep:* 

2: /cgn2_6/ptodata/1/iaa/6_COMB.pep:* 

3: /cgn2_6/ptodata/1/iaa/H_COMB.pep:* 

4: /cgn2_6/ptodata/1/iaa/PCTUS_COMB.pep:* 

5: /cgn2_6/ptodata/1/iaa/RE_COMB.pep:* 

6: /cgn2_6/ptodata/1/iaa/backfiles1.pep:* 
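
The throughput figure in the run header can be checked from the other header values. The sketch below assumes that one "cell update" is one query-residue by database-residue comparison, so the rate is query length times residues searched divided by the search time. That definition is an inference from the numbers printed here, not a documented GenCore formula, and the function name is hypothetical.

```python
def million_cell_updates_per_sec(query_len: int, db_residues: int,
                                 seconds: float) -> float:
    """Assumed definition: (query length x database residues) / time, in millions."""
    return query_len * db_residues / seconds / 1e6


if __name__ == "__main__":
    # This run: 531-residue query, 82675679 residues searched, 46 seconds.
    rate = million_cell_updates_per_sec(531, 82_675_679, 46)
    print(f"{rate:.3f} Million cell updates/sec")
```

With the values from this header, the result agrees with the printed 954.365 to three decimal places.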

Pred. No. is the number of results predicted by chance to have a 
score greater than or equal to the score of the result being printed, 
and is derived by analysis of the total score distribution. 
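
The run parameters above name a Smith-Waterman ("sw") model with a gap-open value of 10.0 and a gap-extension value of 0.5. As a rough illustration of how such a score is computed, the sketch below implements local alignment scoring with affine gaps (the Gotoh recurrences). It is not the GenCore implementation. Two things are assumptions: a gap of length k is charged gap_open + k * gap_ext, and the substitution score is supplied by the caller as a function (the example uses a toy match/mismatch score rather than BLOSUM62).

```python
def smith_waterman_affine(a, b, score, gap_open=10.0, gap_ext=0.5):
    """Best local alignment score of a and b with affine gap penalties."""
    neg_inf = float("-inf")
    m = len(b)
    h_prev = [0.0] * (m + 1)      # best score ending at (i-1, j)
    f_prev = [neg_inf] * (m + 1)  # best score ending at (i-1, j) in a vertical gap
    best = 0.0
    for i in range(1, len(a) + 1):
        h_cur = [0.0] * (m + 1)
        f_cur = [neg_inf] * (m + 1)
        e = neg_inf               # best score ending at (i, j) in a horizontal gap
        for j in range(1, m + 1):
            # Open a new gap from H or extend the gap already running.
            e = max(h_cur[j - 1] - gap_open - gap_ext, e - gap_ext)
            f_cur[j] = max(h_prev[j] - gap_open - gap_ext, f_prev[j] - gap_ext)
            diag = h_prev[j - 1] + score(a[i - 1], b[j - 1])
            # Local alignment: a cell never drops below zero.
            h_cur[j] = max(0.0, diag, e, f_cur[j])
            best = max(best, h_cur[j])
        h_prev, f_prev = h_cur, f_cur
    return best


if __name__ == "__main__":
    # Toy substitution score (an assumption; the report uses BLOSUM62).
    toy = lambda x, y: 5.0 if x == y else -4.0
    print(smith_waterman_affine("AAAAAACCCCCC", "AAAAAAGGCCCCCC", toy))
```

A gap-extension value of 0.5 is also what makes half-point scores such as 1757.5 or 234.5 possible in the listings, since BLOSUM62 entries themselves are integers.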

SUMMARIES 

                  %
Result          Query
   No.   Score  Match  Length  DB  ID                     Description

     1    2764  100.0     531   1  US-08-933-750C-9       Sequence 9, Appli
     2    2764  100.0     531   2  US-09-234-613-9        Sequence 9, Appli
     3    2764  100.0     531   2  US-09-647-143-2        Sequence 2, Appli
     4     595   21.5     115   2  US-09-513-999C-7407    Sequence 7407, Ap
     5     266    9.6     481   2  US-09-248-796A-18683   Sequence 18683, A
     6   234.5    8.5    1742   2  US-09-386-962C-4       Sequence 4, Appli
     7   234.5    8.5    1742   2  US-09-386-959-4        Sequence 4, Appli
     8   233.5    8.4     930   2  US-09-200-650E-3       Sequence 3, Appli
     9     226    8.2     918   2  US-09-200-650E-1       Sequence 1, Appli
    10   225.5    8.2     933   2  US-08-293-728-2        Sequence 2, Appli
    11   225.5    8.2     933   2  US-09-421-868-2        Sequence 2, Appli
    12   225.5    8.2     936   2  US-08-956-171E-5249    Sequence 5249, Ap
    13   225.5    8.2     936   2  US-08-781-986A-5249    Sequence 5249, Ap
    14   223.5    8.1    1315   2  US-09-200-650E-5       Sequence 5, Appli
    15   220.5    8.0    1259   2  US-09-949-016-10366    Sequence 10366, A
    16     218    7.9    1166   2  US-09-200-650E-7       Sequence 7, Appli
    17   212.5    7.7     287   2  US-09-710-279-468      Sequence 468, App
    18   212.5    7.7    1092   2  US-09-147-405B-15      Sequence 15, Appl
    19   203.5    7.4     414   2  US-09-248-796A-19046   Sequence 19046, A
    20     203    7.3     257   2  US-09-461-697-188      Sequence 188, App
    21     203    7.3     272   2  US-09-461-697-186      Sequence 186, App
    22   199.5    7.2     599   2  US-09-538-092-864      Sequence 864, App
    23   198.5    7.2     238   2  US-09-461-697-190      Sequence 190, App
    24     196    7.1     781   2  US-09-949-016-9773     Sequence 9773, Ap
    25     195    7.1     231   2  US-09-461-697-194      Sequence 194, App
    26     195    7.1     232   2  US-09-461-697-192      Sequence 192, App
    27     195    7.1     764   2  US-09-538-092-944      Sequence 944, App
    28     191    6.9     598   2  US-09-538-092-1083     Sequence 1083, Ap
    29     189    6.8     764   2  US-09-370-838-67       Sequence 67, Appl
    30     189    6.8     764   2  US-09-854-133-67       Sequence 67, Appl
    31     186    6.7    3135   1  US-08-323-170B-2       Sequence 2, Appli
    32     186    6.7    3135   2  US-08-954-441-2        Sequence 2, Appli
    33   185.5    6.7    1162   1  US-08-728-323A-2       Sequence 2, Appli
    34   185.5    6.7    1162   2  US-09-298-568-2        Sequence 2, Appli
    35   185.5    6.7    1162   2  US-09-410-399-2        Sequence 2, Appli
    36   185.5    6.7    1162   2  US-09-894-273-2        Sequence 2, Appli
    37     185    6.7     402   2  US-09-248-796A-18910   Sequence 18910, A
    38     184    6.7     674   2  US-08-893-852A-1       Sequence 1, Appli
    39     183    6.6     486   2  US-09-710-279-788      Sequence 788, App
    40     180    6.5     487   2  US-09-386-962C-14      Sequence 14, Appl
    41     180    6.5     487   2  US-09-386-959-65       Sequence 65, Appl
    42   178.5    6.5      40   2  US-09-647-143-16       Sequence 16, Appl
    43   178.5    6.5    1269   2  US-09-949-016-7349     Sequence 7349, Ap
    44   178.5    6.5    1269   2  US-09-949-016-7350     Sequence 7350, Ap


45 


178 


6.4 


1444 


2 


us 


-09 


-949 


-016-9652 


Sequence 


9652, Ap 
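
The Query Match percentages in the summary listing appear to be each hit's score expressed as a percentage of the score of the query aligned to itself (2764, the score of the three 100.0% hits). The sketch below reproduces that relationship. It is inferred from the listed values only, not from the search program's documentation, and the function name `query_match_percent` is a hypothetical helper introduced for illustration.

```python
def query_match_percent(score: float, perfect_score: float = 2764.0) -> float:
    """Return a hit score as a percentage of the perfect (self-match) score.

    Assumption: this relationship is inferred from the rows of the summary
    listing; it is not taken from the search program's documentation.
    """
    if perfect_score <= 0:
        raise ValueError("perfect_score must be positive")
    return round(100.0 * score / perfect_score, 1)


# Scores taken from results 1, 4, 5 and 6 of the summary listing.
for score in (2764, 595, 266, 234.5):
    print(score, query_match_percent(score))
```

Rounding to one decimal place matches the precision printed in the listing; the program's own rounding rule is not stated in the report.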



ALIGNMENTS 



RESULT 1 

US-08-933-750C-9 

Sequence 9, Application US/08933750C 
Patent No. 5932442 
GENERAL INFORMATION: 

APPLICANT: Lal, Preeti
APPLICANT: Hillman, Jennifer L.
APPLICANT: Bandman, Olga
APPLICANT: Shah, Purvi
APPLICANT: Au-Young, Janice
APPLICANT: Yue, Henry 
APPLICANT: Guegler, Karl J. 
APPLICANT: Corley, Neil C. 

TITLE OF INVENTION: HUMAN REGULATORY MOLECULES 
NUMBER OF SEQUENCES: 98 
CORRESPONDENCE ADDRESS: 

ADDRESSEE: Incyte Pharmaceuticals, Inc. 
STREET: 3174 Porter Drive
CITY: Palo Alto
STATE: CA 
COUNTRY: USA 
ZIP: 94304 
COMPUTER READABLE FORM: 
MEDIUM TYPE: Diskette 
COMPUTER: IBM Compatible 
OPERATING SYSTEM: DOS 

SOFTWARE: FastSEQ for Windows Version 2.0 
CURRENT APPLICATION DATA: 

APPLICATION NUMBER: US/08/933,750C
FILING DATE: September 23, 1997 
CLASSIFICATION: 536 
PRIOR APPLICATION DATA: 
APPLICATION NUMBER: 
FILING DATE: 
ATTORNEY/AGENT INFORMATION: 
NAME: Billings, Lucy J. 
REGISTRATION NUMBER: 36,749 
REFERENCE/DOCKET NUMBER: PF-0356 US
TELECOMMUNICATION INFORMATION:
TELEPHONE: 415-855-0555
TELEFAX: 415-845-4166
TELEX:

INFORMATION FOR SEQ ID NO: 9: 
SEQUENCE CHARACTERISTICS: 
LENGTH: 531 amino acids 
TYPE: amino acid 
STRANDEDNESS: single
TOPOLOGY: linear 
IMMEDIATE SOURCE: 

LIBRARY: PITUNOR01 
CLONE: 98974 
US-08-933-750C-9 

Query Match 100.0%; Score 2764; DB 1; Length 531; 

Best Local Similarity 100.0%; Pred. No. 1.4e-227; 

Matches 531; Conservative 0; Mismatches 0; Indels 0; Gaps 0;

Qy    1 MAPTIQTQAQREDGHRPNSHRTLPERSGVVCRVKYCNSLPDIPFDPKFITYPFDQNRFVQ 60
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    1 MAPTIQTQAQREDGHRPNSHRTLPERSGVVCRVKYCNSLPDIPFDPKFITYPFDQNRFVQ 60

Qy   61 YKATSLEKQHKHDLLTEPDLGVTIDLINPDTYRIDPNVLLDPADEKLLEEEIQAPTSSKR 120
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   61 YKATSLEKQHKHDLLTEPDLGVTIDLINPDTYRIDPNVLLDPADEKLLEEEIQAPTSSKR 120

Qy  121 SQQHAKVVPWMRKTEYISTEFNRYGISNEKPEVKIGVSVKQQFTEEEIYKDRDSQITAIE 180
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  121 SQQHAKVVPWMRKTEYISTEFNRYGISNEKPEVKIGVSVKQQFTEEEIYKDRDSQITAIE 180

Qy  181 KTFEDAQKSISQHYSKPRVTPVEVMPVFPDFKMWINPCAQVIFDSDPAPKDTSGAAALEM 240
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  181 KTFEDAQKSISQHYSKPRVTPVEVMPVFPDFKMWINPCAQVIFDSDPAPKDTSGAAALEM 240

Qy  241 MSQAMIRGMMDEEGNQFVAYFLPVEETLKKRKRDQEEEMDYAPDDVYDYKIAREYNWNVK 300
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  241 MSQAMIRGMMDEEGNQFVAYFLPVEETLKKRKRDQEEEMDYAPDDVYDYKIAREYNWNVK 300

Qy  301 NKASKGYEENYFFIFREGDGVYYNELETRVRLSKRRAKAGVQSGTNALLVVKHRDMNEKE 360
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  301 NKASKGYEENYFFIFREGDGVYYNELETRVRLSKRRAKAGVQSGTNALLVVKHRDMNEKE 360

Qy  361 LEAQEARKAQLENHEPEEEEEEEMETEEKEAGGSDEEQEKGSSSEKEGSEDEHSGSESER 420
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  361 LEAQEARKAQLENHEPEEEEEEEMETEEKEAGGSDEEQEKGSSSEKEGSEDEHSGSESER 420

Qy  421 EEGDRDEASDKSGSGEDESSEDEARAARDKEEIFGSDADSEDDADSDDEDRGQAQGGSDN 480
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  421 EEGDRDEASDKSGSGEDESSEDEARAARDKEEIFGSDADSEDDADSDDEDRGQAQGGSDN 480

Qy  481 DSDSGSNGGGQRSRSHSRSASPFPSGSEHSAQEDGSEAAASDSSEADSDSD 531
        |||||||||||||||||||||||||||||||||||||||||||||||||||
Db  481 DSDSGSNGGGQRSRSHSRSASPFPSGSEHSAQEDGSEAAASDSSEADSDSD 531



RESULT 2 
US-09-234-613-9 

Sequence 9, Application US/09234613 
Patent No. 6132973 
GENERAL INFORMATION: 

APPLICANT: Lal, Preeti
APPLICANT: Hillman, Jennifer L.
APPLICANT: Bandman, Olga
APPLICANT: Shah, Purvi
APPLICANT: Au-Young, Janice
APPLICANT: Yue, Henry 
APPLICANT: Guegler, Karl J. 
APPLICANT: Corley, Neil C. 

TITLE OF INVENTION: HUMAN REGULATORY MOLECULES 
NUMBER OF SEQUENCES: 98 
CORRESPONDENCE ADDRESS: 

ADDRESSEE: Incyte Pharmaceuticals, Inc. 
STREET: 3174 Porter Drive 
CITY: Palo Alto 
STATE: CA 
COUNTRY: USA 
ZIP: 94304 
COMPUTER READABLE FORM: 
MEDIUM TYPE: Diskette 
COMPUTER: IBM Compatible 
OPERATING SYSTEM: DOS 

SOFTWARE: FastSEQ for Windows Version 2.0 
CURRENT APPLICATION DATA: 

APPLICATION NUMBER: US/09/234,613 
FILING DATE: 
CLASSIFICATION: 
PRIOR APPLICATION DATA: 

APPLICATION NUMBER: US/08/933,750 
FILING DATE: September 23, 1997 
ATTORNEY/AGENT INFORMATION: 
NAME: Billings, Lucy J. 
REGISTRATION NUMBER: 36,749 
REFERENCE/DOCKET NUMBER: PF-0356 US
TELECOMMUNICATION INFORMATION:
TELEPHONE: 415-855-0555 
TELEFAX: 415-845-4166 
TELEX: 

INFORMATION FOR SEQ ID NO: 9: 
SEQUENCE CHARACTERISTICS: 
LENGTH: 531 amino acids 
TYPE: amino acid 
STRANDEDNESS: single
TOPOLOGY: linear 
IMMEDIATE SOURCE: 

LIBRARY: PITUNOR01 
CLONE: 98974 
US-09-234-613-9 

Query Match 100.0%; Score 2764; DB 2; Length 531; 

Best Local Similarity 100.0%; Pred. No. 1.4e-227; 

Matches 531; Conservative 0; Mismatches 0; Indels 0; Gaps 0; 

Qy    1 MAPTIQTQAQREDGHRPNSHRTLPERSGVVCRVKYCNSLPDIPFDPKFITYPFDQNRFVQ 60
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    1 MAPTIQTQAQREDGHRPNSHRTLPERSGVVCRVKYCNSLPDIPFDPKFITYPFDQNRFVQ 60

Qy   61 YKATSLEKQHKHDLLTEPDLGVTIDLINPDTYRIDPNVLLDPADEKLLEEEIQAPTSSKR 120
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   61 YKATSLEKQHKHDLLTEPDLGVTIDLINPDTYRIDPNVLLDPADEKLLEEEIQAPTSSKR 120

Qy  121 SQQHAKVVPWMRKTEYISTEFNRYGISNEKPEVKIGVSVKQQFTEEEIYKDRDSQITAIE 180
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  121 SQQHAKVVPWMRKTEYISTEFNRYGISNEKPEVKIGVSVKQQFTEEEIYKDRDSQITAIE 180

Qy  181 KTFEDAQKSISQHYSKPRVTPVEVMPVFPDFKMWINPCAQVIFDSDPAPKDTSGAAALEM 240
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  181 KTFEDAQKSISQHYSKPRVTPVEVMPVFPDFKMWINPCAQVIFDSDPAPKDTSGAAALEM 240

Qy  241 MSQAMIRGMMDEEGNQFVAYFLPVEETLKKRKRDQEEEMDYAPDDVYDYKIAREYNWNVK 300
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  241 MSQAMIRGMMDEEGNQFVAYFLPVEETLKKRKRDQEEEMDYAPDDVYDYKIAREYNWNVK 300

Qy  301 NKASKGYEENYFFIFREGDGVYYNELETRVRLSKRRAKAGVQSGTNALLVVKHRDMNEKE 360
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  301 NKASKGYEENYFFIFREGDGVYYNELETRVRLSKRRAKAGVQSGTNALLVVKHRDMNEKE 360

Qy  361 LEAQEARKAQLENHEPEEEEEEEMETEEKEAGGSDEEQEKGSSSEKEGSEDEHSGSESER 420
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  361 LEAQEARKAQLENHEPEEEEEEEMETEEKEAGGSDEEQEKGSSSEKEGSEDEHSGSESER 420

Qy  421 EEGDRDEASDKSGSGEDESSEDEARAARDKEEIFGSDADSEDDADSDDEDRGQAQGGSDN 480
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  421 EEGDRDEASDKSGSGEDESSEDEARAARDKEEIFGSDADSEDDADSDDEDRGQAQGGSDN 480

Qy  481 DSDSGSNGGGQRSRSHSRSASPFPSGSEHSAQEDGSEAAASDSSEADSDSD 531
        |||||||||||||||||||||||||||||||||||||||||||||||||||
Db  481 DSDSGSNGGGQRSRSHSRSASPFPSGSEHSAQEDGSEAAASDSSEADSDSD 531



RESULT 3 



US-09-647-143-2 

; Sequence 2, Application US/09647143
; Patent No. 6680196
; GENERAL INFORMATION:
; APPLICANT: Batra, Surinder K.
; APPLICANT: Hollingsworth, Michael A.
; APPLICANT: University of Nebraska Board of Regents
; TITLE OF INVENTION: Novel Gene That is Amplified and
; TITLE OF INVENTION: Overexpressed in Cancer and Methods of Use Thereof
; FILE REFERENCE: UNMC63121
; CURRENT APPLICATION NUMBER: US/09/647,143
; CURRENT FILING DATE: 2000-09-27
; PRIOR APPLICATION NUMBER: PCT/US99/06633
; PRIOR FILING DATE: 1999-03-26
; PRIOR APPLICATION NUMBER: 60/079,649
; PRIOR FILING DATE: 1998-03-27
; NUMBER OF SEQ ID NOS: 22
; SOFTWARE: FastSEQ for Windows Version 3.0
; SEQ ID NO 2
; LENGTH: 531
; TYPE: PRT
; ORGANISM: Homo sapiens
US-09-647-143-2 

Query Match 100.0%; Score 2764; DB 2; Length 531; 

Best Local Similarity 100.0%; Pred. No. 1.4e-227; 

Matches 531; Conservative 0; Mismatches 0; Indels 0; Gaps 0; 

Qy    1 MAPTIQTQAQREDGHRPNSHRTLPERSGVVCRVKYCNSLPDIPFDPKFITYPFDQNRFVQ 60
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    1 MAPTIQTQAQREDGHRPNSHRTLPERSGVVCRVKYCNSLPDIPFDPKFITYPFDQNRFVQ 60

Qy   61 YKATSLEKQHKHDLLTEPDLGVTIDLINPDTYRIDPNVLLDPADEKLLEEEIQAPTSSKR 120
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   61 YKATSLEKQHKHDLLTEPDLGVTIDLINPDTYRIDPNVLLDPADEKLLEEEIQAPTSSKR 120

Qy  121 SQQHAKVVPWMRKTEYISTEFNRYGISNEKPEVKIGVSVKQQFTEEEIYKDRDSQITAIE 180
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  121 SQQHAKVVPWMRKTEYISTEFNRYGISNEKPEVKIGVSVKQQFTEEEIYKDRDSQITAIE 180

Qy  181 KTFEDAQKSISQHYSKPRVTPVEVMPVFPDFKMWINPCAQVIFDSDPAPKDTSGAAALEM 240
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  181 KTFEDAQKSISQHYSKPRVTPVEVMPVFPDFKMWINPCAQVIFDSDPAPKDTSGAAALEM 240

Qy  241 MSQAMIRGMMDEEGNQFVAYFLPVEETLKKRKRDQEEEMDYAPDDVYDYKIAREYNWNVK 300
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  241 MSQAMIRGMMDEEGNQFVAYFLPVEETLKKRKRDQEEEMDYAPDDVYDYKIAREYNWNVK 300

Qy  301 NKASKGYEENYFFIFREGDGVYYNELETRVRLSKRRAKAGVQSGTNALLVVKHRDMNEKE 360
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  301 NKASKGYEENYFFIFREGDGVYYNELETRVRLSKRRAKAGVQSGTNALLVVKHRDMNEKE 360

Qy  361 LEAQEARKAQLENHEPEEEEEEEMETEEKEAGGSDEEQEKGSSSEKEGSEDEHSGSESER 420
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  361 LEAQEARKAQLENHEPEEEEEEEMETEEKEAGGSDEEQEKGSSSEKEGSEDEHSGSESER 420

Qy  421 EEGDRDEASDKSGSGEDESSEDEARAARDKEEIFGSDADSEDDADSDDEDRGQAQGGSDN 480
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  421 EEGDRDEASDKSGSGEDESSEDEARAARDKEEIFGSDADSEDDADSDDEDRGQAQGGSDN 480

Qy  481 DSDSGSNGGGQRSRSHSRSASPFPSGSEHSAQEDGSEAAASDSSEADSDSD 531
        |||||||||||||||||||||||||||||||||||||||||||||||||||
Db  481 DSDSGSNGGGQRSRSHSRSASPFPSGSEHSAQEDGSEAAASDSSEADSDSD 531



RESULT 4 

US-09-513-999C-7407 

; Sequence 7407, Application US/09513999C
; Patent No. 6783961
; GENERAL INFORMATION:
; APPLICANT: Dumas Milne Edwards, J.B.
; APPLICANT: Duclert, A.
; APPLICANT: Giordano, J.Y.
; TITLE OF INVENTION: Expressed Sequence Tags and Encoded Human Proteins.
; Patent No. 6783961
; FILE REFERENCE: 59.US2.REG
; CURRENT APPLICATION NUMBER: US/09/513,999C
; CURRENT FILING DATE: 2000-02-24
; PRIOR APPLICATION NUMBER: US 60/122,487
; PRIOR FILING DATE: 1999-02-26
; NUMBER OF SEQ ID NOS: 36681
; SOFTWARE: Patent.pm
; SEQ ID NO 7407
; LENGTH: 115
; TYPE: PRT
; ORGANISM: Homo sapiens
; FEATURE:
; NAME/KEY: UNSURE
; LOCATION: 25
; OTHER INFORMATION: Xaa=Glu or Gly
; FEATURE:
; NAME/KEY: UNSURE
; LOCATION: 26
; OTHER INFORMATION: Xaa=Arg or Ser
; FEATURE:
; NAME/KEY: UNSURE
; LOCATION: 110
; OTHER INFORMATION: Xaa=Glu or Gly
; FEATURE:
; NAME/KEY: UNSURE
; LOCATION: 114
; OTHER INFORMATION: Xaa=Ala or Gly
US-09-513-999C-7407 

Query Match 21.5%; Score 595; DB 2; Length 115; 

Best Local Similarity 96.5%; Pred. No. 2.8e-43; 

Matches 111; Conservative 0; Mismatches 4; Indels 0; Gaps 0;

Qy    1 MAPTIQTQAQREDGHRPNSHRTLPERSGVVCRVKYCNSLPDIPFDPKFITYPFDQNRFVQ 60
        ||||||||||||||||||||||||  ||||||||||||||||||||||||||||||||||
Db    1 MAPTIQTQAQREDGHRPNSHRTLPXXSGVVCRVKYCNSLPDIPFDPKFITYPFDQNRFVQ 60

Qy   61 YKATSLEKQHKHDLLTEPDLGVTIDLINPDTYRIDPNVLLDPADEKLLEEEIQAP 115
        ||||||||||||||||||||||||||||||||||||||||||||||||| ||| |
Db   61 YKATSLEKQHKHDLLTEPDLGVTIDLINPDTYRIDPNVLLDPADEKLLEXEIQXP 115



RESULT 5 

US-09-248-796A-18683 

Sequence 18683, Application US/09248796A 
Patent No. 6747137 
GENERAL INFORMATION: 
APPLICANT: Keith Weinstock et al 

TITLE OF INVENTION: NUCLEIC ACID AND AMINO ACID SEQUENCES RELATING TO CANDIDA ALBICANS
TITLE OF INVENTION: FOR DIAGNOSTICS AND THERAPEUTICS
FILE REFERENCE: 107196.132
CURRENT APPLICATION NUMBER: US/09/248,796A
CURRENT FILING DATE: 1999-02-12
PRIOR APPLICATION NUMBER: US 60/074,725
PRIOR FILING DATE: 1998-02-13
PRIOR APPLICATION NUMBER: US 60/096,409
PRIOR FILING DATE: 1998-08-13
NUMBER OF SEQ ID NOS: 28208
SEQ ID NO 18683 
LENGTH: 481 
TYPE: PRT 

ORGANISM: Candida albicans 
US-09-248-796A-18683 

Query Match 9.6%; Score 266; DB 2; Length 481; 

Best Local Similarity 22.0%; Pred. No. 2.8e-14; 

Matches 110; Conservative 91; Mismatches 156; Indels 142; Gaps 20; 

Qy 18 NSHRTL-PERSGVVCRVKYCNSLPDIPFDPKFITY----PFDQNRFVQYKATSL-EKQHK 71

:|:::| | | : :|:| hll I : I I It I I : :l =11 |:: 
Db 16 SSNKSLKPIRQDYIAKVRYTNNLPPPPLNPKFIEYNTTDPISTQQEGEYLISSLFRKENF 75 



Qy 72 HDLLTEPD--LGVTIDLINPDTY RIDPN VLLDPADEKLLEE 110

:|: I II: ::IM : :: II :| I |:|: 

Db 76 QNLMERIDDQLGLDLNLINNRGFLSEDKMNESVGKLKYNQLHPNDRALLRDAGIGKILKN 135 

Qy 111 EIQAPTSSKRSQQHAKVVPWMRKTEYISTEFNRYGISNEKPEVKIG VSVKQ 161

| : I : : I : I I I I I : : I I I : I I ' 

Db 136 EPE VSFLRRTEYIS DRPLSKGGNNLNTATEEIKVKE 171 

Qy 162 QFTEEEIYKDRDSQITAIEKTFEDAQKSI SQHYSKPRVTPVEVMPVFPDFKMWIN 216 

: :::| : I III: :|::| I :|: :l I : I I'M I I 

Db 172 KLSKDEHF-DADSQLQNVEESFTVANESLYDLKNIKHPKKKHLRAVNTWPLLPDTSMLDN 230 

Qy 217 PCAQVIF-DSDPAPKDTSGAAALEMMSQAMI RGMMDE EGNQFV 258

: | | :: : : I II M ::: 

Db 231 VFINLRFMGS AS INRELNNLKQQQQQQQQQNDKKFDEKLFDRALES S LFKPI KLEGGEWI 290 

Qy 259 AYFL--PVEETLKKRKRDQEEEMDYAPDDVYD--YKIAREYNWNV KNKA 303

: : | : III:: : I : I : : : I I : I ' 

Db 291 SMYLLDATNTSTTANDNDNEEQI NDLYEKLHSTKKEQPINLLDEDEESLETYKFKY 346 

Qy 304 SKGYEENYFFIFREGDGV YYNELETRVRLSKRRA- 337

: | | : | | : : I I : : : I I I I 

Db 347 TKNYDMTYQPFEHENEELAIKFVSDEIEDPVSKDNFKRKRKMAYYYPINGKIELKKHRAS 406 



Qy 338 KAGVQSGTNALLVVKHRDMNEKELEAQEARKAQLENHEPEEEEEEEMETEE 388

Y " ~ i I I * I | :: | | :::::: I I I : I I I I I I 
Db 407 TNSEINKFIKERTYDGINFIL REPSTNELKRLDTIRSEYDPMEYEGEDEEEEEEEE 462 

Qy 389 KEAGGSDEEQEKGSSSEKE 407 

: i : I : I : : : : : I 
Db 463 EEEPLEEEQQQQEVETKEE 481 
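
The Best Local Similarity figure reported with each alignment appears to equal the number of identical positions divided by the total number of alignment columns, counted as matches + conservative + mismatches + indels. The sketch below checks that reading against the figures printed for RESULT 4 and RESULT 5. The relationship is inferred from the reported numbers only, and `best_local_similarity` is a hypothetical helper name, not part of the search program.

```python
def best_local_similarity(matches: int, conservative: int,
                          mismatches: int, indels: int) -> float:
    """Return identical positions as a percentage of all alignment columns.

    Assumption: the denominator is the sum of the four counts printed on the
    "Matches; Conservative; Mismatches; Indels" line of each result.
    """
    columns = matches + conservative + mismatches + indels
    if columns == 0:
        raise ValueError("alignment has no columns")
    return round(100.0 * matches / columns, 1)


# Counts as printed for RESULT 4 and RESULT 5.
print(best_local_similarity(111, 0, 4, 0))
print(best_local_similarity(110, 91, 156, 142))
```

The same arithmetic can be applied to the counts of any other result in this listing as a consistency check on the OCR'd figures.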

RESULT 6 

US-09-386-962C-4 

Sequence 4, Application US/09386962C 
Patent No. 6635473 
GENERAL INFORMATION: 

TITLE OF INVENTION: POLYPEPTIDES AND POLYNUCLEOTIDES FROM COAGULASE-NEGATIVE STAPHYLOCOCCI

FILE REFERENCE: P06335US2/BAS 
CURRENT APPLICATION NUMBER: US/09/386,962C
CURRENT FILING DATE: 1999-08-31
PRIOR APPLICATION NUMBER: 60/098,443
PRIOR FILING DATE: 1998-08-31
PRIOR APPLICATION NUMBER: 60/117,119
PRIOR FILING DATE: 1999-01-25
NUMBER OF SEQ ID NOS: 38
SOFTWARE: PatentIn version 3.0
SEQ ID NO 4 
LENGTH: 1742 
TYPE: PRT 

ORGANISM: Staphylococcus epidermidis 
US-09-386-962C-4 

Query Match 8.5%; Score 234.5; DB 2; Length 1742; 

Best Local Similarity 21.0%; Pred. No. 8.7e-11;

Matches 127; Conservative 88; Mismatches 240; Indels 149; Gaps 24;



Qy   48 FITYPFDQNRFVQYKATSLEKQHKHDLLTEPDLGVTIDLINPDTYRIDP-----NVLLDP 102
        ..i |l:| 1 :: 1 1 I" M ' 1 1 :|
Db  716 YVTLKDSNNRELQRVTTDQSGHYQFDNLQNGT--YTVEFAIPDNYTPSPANNSTNDAIDS 773

Qy  103 ADEKLLEEEIQAPTSSKRSQQHAKV------VPWMRKTEYISTEFNRYGISNEKPEVKIG 156
        i i • I • • 1 : 1 1 : : :
Db  774 DGERDGTRKVVVAKGTINNADNMTVDTGFYLTPKYNVGDYVWEDTNKDGIQDDNEKGISG 833

Qy  157 VSV KQQFT--EEEIY KDRDSQ 175
        || 1 :|| 1 1 : :M
Db  834 VKVTLKNKNGDTIGTTTTDSNGKYEFTGLENGDYTIEFETPEGYTPTKQNSGSDEGKDSN 893

Qy  176 ITAIEKTFEDA-QKSISQHYSKPRVTPVEVMPVFPDFKMWINPCAQVIFDSDP 227
        1 1 : 1 | I : 1 : 1 1 1 : : 1 s II
Db  894 GTKTTVTVKDADNKTIDSGFYKPTYN LGDY-VWEDTNKDGIQDDSEKGISGVK 945

Qy  228 -APKDTSGAA ALEMMSQAMIRGMMDEEGNQFVAYFLPVEETLKKRKRDQEEEMDY- 281
        || :| | : :|: 1 1: 1 : 1 1 1 ' : : <
Db  946 VTLKDKNGNAIGTTTTDASGHYQFKGL--ENGSYTVEFETPSGYTPTKANSGQDITVDSN 1003

Qy  282 APDDVYD YKIAR EYNWNVKNK ASKGY 307

Db 1004 GITTTGIINGADNLTIDSGFYKTPKYSVGDYVWEDTNKDGIQDDNEKGISGVKVTLKDEK 1063

Qy  308 EENYFFIFREGD-GVYYNELETRVRLSKRRAKAGVQSGTNALLVVKHRDMN 357
        • | | : I III I '
Db 1064 GNIISTTTTDENGKYQFDNLDSGNYIIHFEKPEGMTQTTANSG NDD 1109

Qy  358 EKELEAQEARKA QLENHEPEEEEEEEMETEEKEAGGSDEEQEKGSSSEKEGSE 410

Db 1110 EKDADGEDVRVTITDHDDFSIDNGYFDDDSDSDSDADSDSDSDSDSDADSDSDADSDSDA 1169

Qy  411 DEHSGSESERE-EGDRDEASDK-SGSGEDESSEDEARAARDKEEIFGSDADSEDDADSDD 468

Db 1170 DSDSDSDSDSDADSDSDSDSDSDSDSDSDADSDSDSDSDSDADSDSDSDSDSDSDSDSDS 1229

Qy  469 EDRGQAQGGSDNDSDSGSNGGGQRSRSHSRSASPFPSGSEHSAQEDGSEAAASDS-SEAD 527

Db 1230 DSDSDSDSDSDSDSDSDSDSD SDSDADSDSDADSDSDSDSDSDADSDSDSDSDSDAD 1286

Qy  528 SDSD 531
        ||||
Db 1287 SDSD 1290



RESULT 7 
US-09-386-959-4 

Sequence 4, Application US/09386959
Patent No. 6703025
GENERAL INFORMATION:
APPLICANT: PATTI, Joseph M.
APPLICANT: FOSTER, Timothy J.
APPLICANT: HOOK, Magnus
TITLE OF INVENTION: MULTICOMPONENT VACCINES
FILE REFERENCE: P06333US1/BAS
CURRENT APPLICATION NUMBER: US/09/386,959
CURRENT FILING DATE: 1999-08-31
EARLIER APPLICATION NUMBER: 60/098,439
EARLIER FILING DATE: 1999-08-31
NUMBER OF SEQ ID NOS: 65
SOFTWARE: PatentIn Ver. 2.0
SEQ ID NO 4
LENGTH: 1742
TYPE: PRT

ORGANISM: Staphylococcus epidermidis
US-09-386-959-4 

Query Match 8.5%; Score 234.5; DB 2; Length 1742;

Best Local Similarity 21.0%; Pred. No. 8.7e-11;

Qy   48 FITYPFDQNRFVQYKATSLEKQHKHDLLTEPDLGVTIDLINPDTYRIDP-----NVLLDP 102

Db  716 YVTLKDSNNRELQRVTTDQSGHYQFDNLQNGT--YTVEFAIPDNYTPSPANNSTNDAIDS 773

Qy  103 ADEKLLEEEIQAPTSSKRSQQHAKV------VPWMRKTEYISTEFNRYGISNEKPEVKIG 156

Db  774 DGERDGTRKVVVAKGTINNADNMTVDTGFYLTPKYNVGDYVWEDTNKDGIQDDNEKGISG 833

Qy  157 VSV KQQFT--EEEIY KDRDSQ 175

|| I : II I I : : I I 

Db 834 VKVTLKNKNGDTIGTTTTDSNGKYEFTGLENGDYTIEFETPEGYTPTKQNSGSDEGKDSN 893 

Qy 176 ITAIEKTFEDA-QKSISQHYSKPRVTPVEVMPVFPDFKMWINPCAQVIFDSDP 227 

I I :|| 1:1 : II I: :l : I I 

Db 894 GTKTTVTVKDADNKTIDSGFYKPTYN LGDY-VWEDTNKDGIQDDSEKGI SGVK 945 

Qy 228 -APKDTSGAA ALEMMSQAMIRGMMDEEGNQFVAYFLPVEETLKKRKRDQEEEMDY- 281 

| | : | | : :|: I I: i : I II I s 2 I 

Db 946 VTLKDKNGNAIGTTTTDASGHYQFKGL — ENGSYTVEFETPSGYTPTKANSGQDITVDSN 1003 



Qy 282 APDDVYD YKIAR EYNWNVKNK ASKGY 307

I : I II : :| I II M 
Db 1004 GITTTGIINGADNLTIDSGFYKTPKYSVGDYVWEDTNKDGIQDDNEKGISGVKVTLKDEK 1063 



Qy 308 EENYFFIFREGD-GVYYNELETRVRLSKRRAKAGVQSGTNALLVVKHRDMN 357

: || : I I I I I : : : I : I : : 

Db 1064 GNIISTTTTDENGKYQFDNLDSGNYIIHFEKPEGMTQTTANSG NDD 1109 

Qy 358 EKELEAQEARKA QLENHEPEEEEEEEMETEEKEAGGSDEEQEKGSSSEKEGSE 410 

| | : : : : I : : I ::::::: | | : : | : : : 

Db 1110 EKDADGEDVRVTITDHDDFSIDNGYFDDDSDSDSDADSDSDSDSDSDADSDSDADSDSDA 1169 

Qy 411 DEHSGSESERE-EGDRDEASDK-SGSGEDESSEDEARAARDKEEIFGSDADSEDDADSDD 468 

| | I : I : : : I I II II I I : : : : I : I I : I I : I : I I I 
Db 1170 DSDSDSDSDSDADSDSDSDSDSDSDSDSDADSDSDSDSDSDADSDSDSDSDSDSDSDSDS 1229 

Qy 469 EDRGQAQGGSDNDSDSGSNGGGQRSRSHSRSASPFPSGSEHSAQEDGSEAAASDS-SEAD 527 

: ||:llll I: I I : I I I I: : I = Ml Ml 

Db 1230 DSDSDSDSDSDSDSDSDSDSD SDSDADSDSDADSDSDSDSDSDADSDSDSDSDSDAD 1286 

Qy 528 SDSD 531 

I I I I 

Db 1287 SDSD 1290 



RESULT 8 

US-09-200-650E-3 

; Sequence 3, Application US/09200650E
; Patent No. 6680195
; GENERAL INFORMATION:
; APPLICANT: Patti, Joseph M.
; APPLICANT: Foster, Timothy J.
; APPLICANT: Hook, Magnus A.O.
; APPLICANT: Eidhinn, Deirdre Ni
; APPLICANT: Perkins, Samuel L.
; TITLE OF INVENTION: Extracellular Matrix-Binding Proteins from Staphylococcus aureus
; FILE REFERENCE: P06283US2/BAS
; CURRENT APPLICATION NUMBER: US/09/200,650E
; CURRENT FILING DATE: 1998-11-25
; PRIOR APPLICATION NUMBER: 60/066,815
; PRIOR FILING DATE: 1997-11-26
; PRIOR APPLICATION NUMBER: 60/098,427
; PRIOR FILING DATE: 1998-08-31
; NUMBER OF SEQ ID NOS: 23
; SOFTWARE: PatentIn Ver. 2.0
; SEQ ID NO 3
; LENGTH: 930
; TYPE: PRT
; ORGANISM: Staphylococcus aureus
US-09-200-650E-3 

Query Match 8.4%; Score 233.5; DB 2; Length 930; 

Best Local Similarity 20.5%; Pred. No. 4.3e-11;

Matches 130; Conservative 77; Mismatches 211; Indels 215; Gaps 25; 

Qy 53 FDQNRFVQYKATSLEKQHKHDLLTEPDLGVTIDLINPDTYRIDPNVLLDPADEKLLEEEI 112

| : I I : I : : I I : II : III IN 
Db 298 FEQVAFAKRKNATTDK TAYKMEVT LGNDTY SEEI 331

Qy 113 QAPT S S KRSQQHAKVVPWMRKT E YI ST E 140

:|::| I : I II: I 

Db 332 IVDYGNKKAQ PLISSTNYINNEDLSRNMTAYVNQPKNTYTKQTFVTNLTGYKFN 385 



Qy 141 FNRYGISNEK PEVKIGVSVKQQFTEEEIYKDRDSQITAIEKTFEDAQK 188

||:::: I : I I I : = I : : : : I 
Db 386 PNAKNFKIYEVTDQNQFVDSFTPDTSKLKDVTDQF DVIYSNDNKTATVD LMKGQT 440 



Qy 189 SISQHYSKPRVTPVEVMPVFPDFKMWINPCAQVIFDSDPAPKDTSGAAALEMMSQAMIRG 248 

| : : I : I : I I I I ' I I I : : I 

Db 441 SSNKQYIIQQV AYPDNSSTDNGKIDYTLDTDKTKYSWSN SYSNVNG 486 

Qy 249 MMDEEGNQFVAYFL PVEETLKKRKRDQEE EMD 280 

|:| I I 1:1 I hi I hi 

Db 487 SSTANGDQ-KKYNLGDYVWEDTNKDGKQDANEKGIKGVYVILKDSNGKELDRTTTDENGK 545 

Qy 281 YAP DDVYD YKI 291

Y I I II I II 

Db 546 YQFTGLSNGTYSVEFSTPAGYTPTTANVGTDDAVDSDGLTTTGVI KDADNMTLDSGFYKT 605 



Qy 292 AR EYNWNVKNK ASKGY EENYFFIFREGDGV 321

: : I I I I II : I I : I I 

Db 606 PKYSLGDYVWYDSNKDGKRDSTEKGIKGVKVTLQNEKGEVIGTTETDENGKYRFDNLDSG 665 

Qy 322 YYNELETRVRLSKRRAKAGV-QSGTNALLVVKHRDMNEKELEAQEARKAQLEN-HEPEEE 379 

| : | | I I : I : I I I | | I : : : I : I : I I 

Db 666 KY KVIFEK PAGLTQTGTNTTEDDKDADGGEVDVTITDHDDFTLDNGYYEEET 717 

Qy 380 EEEEMETEEKEAGGSDEEQEKGSSSEKEGSEDEHSGSESEREEGDRDEASDKSGSGEDES 439 

: : : : : I I : : I I : : I | I : I : : : : : I I I 

Db 718 SDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSNSDSDSDSDSD 777 

Qy 440 SEDEARAARDKEEIFGSDADSEDDADSDDEDRGQAQGGSDNDSDSGSNGGGQRSRSHSRS 499 

| : : : : | : ||:||: |:||l : : II: : I I I I 

Db 778 SDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSD-SDSDSDS 836 

Qy 500 ASPFPSGSEHSAQEDGSEAAASDS-SEADSDSD 531 

| ||: : I : I I I I I I I I I 
Db 837 DSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSD 869 



RESULT 9 

US-09-200-650E-1 

Sequence 1, Application US/09200650E 
Patent No. 6680195 
GENERAL INFORMATION: 
APPLICANT: Patti, Joseph M. 
APPLICANT: Foster, Timothy J. 
APPLICANT: Hook, Magnus A.O. 
APPLICANT: Eidhinn, Deirdre Ni 
APPLICANT: Perkins, Samuel L. 

TITLE OF INVENTION: Extracellular Matrix-Binding Proteins from Staphylococcus aureus

FILE REFERENCE: P06283US2/BAS
CURRENT APPLICATION NUMBER: US/09/200,650E
CURRENT FILING DATE: 1998-11-25
PRIOR APPLICATION NUMBER: 60/066,815
PRIOR FILING DATE: 1997-11-26
PRIOR APPLICATION NUMBER: 60/098,427
PRIOR FILING DATE: 1998-08-31
NUMBER OF SEQ ID NOS: 23
SOFTWARE: PatentIn Ver. 2.0
SEQ ID NO 1 
LENGTH: 918 
TYPE: PRT 

ORGANISM: Staphylococcus aureus 
US-09-200-650E-1 

Query Match 8.2%; Score 226; DB 2; Length 918; 

Best Local Similarity 22.6%; Pred. No. 1.8e-10; 

Matches 125; Conservative 75; Mismatches 208; Indels 146; Gaps 23; 

Qy 33 VKYCNSLPDIPF-DPKFITYPFDQNRFVQYKATSLEKQHKHDLLTEPDLGVTIDLIN 88 

| | | | : I I I I I I I I : I : I I : I I : I 

Db 276 VDYSNSNNTMPIADIK STNGDVVAKAT YD I LT KT YT FVFT D YVNNKE 322

Qy 89 PDTYRIDPNVLLDPADEKLLEE EIQAPTSSKRS 121

I : I I : : i I I : | | : I 

Db 323 NINGQFSLPLFTDRAKAPKSGTYDANINI--ADEMFNNKITYNYSSPIAGIDKPNGANIS 380



Qy 122 QQHAKVVPWMRKTEYISTEFNRYGISNEKPEVKIGVSVKQQFTEEEIYKDRDSQITAIE- 180

| | : | I I III | : :::| :: :::| : 

Db 381 SQII GVDTAS GQNT YKQTVF VNPKQRVLGNTWVYIKGYQDKI-EESSGKVSATDT 434

Qy 181 --KTFE--DAQKSISQHYSKPRVTPV-EVMPVFPDFKMWINPCAQVIFDSDPAPKDTSGA 235

: | | I I : | : | : : I I I : : : I I I 
Db 435 KLRIFEVNDTSKLSDSYYADPNDSNLKEVTDQFKNRIYYEHPNVASIKFGD 485 

Qy 236 AALEMMSQAMIRGMMDEEGNQFVAYFLPVEETLKKRKRDQEEEMDYAPDDVYDYKIAREY 295 

: :: I I I I : I : : I II I : 
Db 486 --ITKTYVVLVEGHYDNTG KNLKTQVIQENVDPVTNRDYSI F 525

Qy 296 NWNVKNKASKGYEENYFFIFREGDGVYYNELETRVRLSKRRAKAGVQSGTNALLVVKHRD 355

|| :| : I I I I :| 
Db 526 GWNNEN WRYG GGSADGDSA 545 

Qy 356 MNEKE LEAQEARKAQLE NHEPEEEEEEEMETEEKEAGGSDEEQEK 400 



: I I : : : : : : I : I I I : : : : : III: 

Db 546 VNPKDPTPGPPVDPEPSPDPEPEPTPDPEPSPDPEPEPSPDPDPDSDSDSDSGSDSDSGS 605 

Qy 401 GSSSEKEGSEDEHSGSESERE-EGDRDEASDK-SGSGEDESSEDEARAARDKEEIFGSDA 458 

III: I I |:|: : I I I II I I I I — — I : M: 
Db 606 DSDSESDSDSDSDSDSDSDSDSESDSDSESDSDSDSDSDSDSDSDSDSDSDSDSDSDSDS 665 

Qy 459 DSEDDADSDDEDRGQAQGGSDNDSDSGSNGGGQRSRSHSRSASPFPSGSEHSAQEDGSEA 518 

I I : : : I I I I : : I I : I I I I I : I I I I I I I : : I 

Db 666 DSDSESDSDSESDSESDSDSDSDSDSDSDSDSD-SDSDSDSDSDSDSDSDSDSDSDSDSD 724 

Qy 519 AASDS-SEADSDSD 531 

: III I::||lll 
Db 725 SDSDSDSDSDSDSD 738 



RESULT 10 
US-08-293-728-2 

Sequence 2, Application US/08293728D 
Patent No. 6008341 
GENERAL INFORMATION: 
APPLICANT: Foster, Timothy J.
APPLICANT: McDevitt, Damien L.

TITLE OF INVENTION: The S. aureus Fibrinogen Binding Protein Gene
FILE REFERENCE: 05344.105011
CURRENT APPLICATION NUMBER: US/08/293,728D
CURRENT FILING DATE: 1994-08-22
NUMBER OF SEQ ID NOS: 20
SOFTWARE: PatentIn Ver. 2.0
SEQ ID NO 2 
LENGTH: 933 
TYPE: PRT 

ORGANISM: Staphylococcus aureus 
US-08-293-728-2 

Query Match 8.2%; Score 225.5; DB 2; Length 933; 

Best Local Similarity 21.4%; Pred. No. 2.1e-10; 

Matches 103; Conservative 76; Mismatches 187; Indels 115; Gaps 18; 

Qy   83 TIDLINP--DTYRIDPNVLLDPADEKLLEEEIQAPTSSKRSQQHAKVVPWMRKTEYISTE 140
        MM: : I I I : : : | : : : : II :
Db  383 TIDQIDKTNNTYR--QTIYVNPSGDNVI APVLT 413

Qy  141 FNRYGISNEKPEVKIGVSVKQQFTEEEIYK DRDSQITAIEKTFEDAQKSISQHYS 195
        Ml : I I I ::l I I : I I I :
Db  414 GNLKPNTDSNALIDQQNTSIKVYKVDNAADLSES YFVNPENFEDVTNSVNITFP 467

Qy  196 KPRVTPVEVMPVFPDFKMWINPCAQVIFDSDPAPKDTSGAAALEMM SQAMIRGM- 249
        | || ||:: I II :: I I I I : I I
Db  468 NPNQYKVEFNT--PDDQITTPYIVWNGHIDP NSKGDLALRSTLYGYNSNIIWRSMS 522

Qy  250 MDEEGNQFVAYFL PVEETLKKRKRDQEEEMDYAPDDVYDYKIAREYNWNVKNKA 303
        || ||: : : : : : | : | : : I : I : :
Db  523 WDNE VAFNNGSGSGDGIDKPWPEQPDEPGEIEPIPED SDS 563

Qy  304 SKGYEENYFFIFREGDGVYYNELETRVRLSKRRAKAGVQSGTNALL VVK 352
        I : | : : | I I :::
Db  564 DPGSDSG SDSNSDSGSDSGSDSTSDSGSDSASDSDSAS 601

Qy  353 HRDMNEKELEAQEARKAQLENHEPEEEEEEEMETEEKEAGGSDEEQEKGSSSEKEGSEDE 412
        | I :: I ::::::::: 1 1 : 1 J_i* * 1
Db  602 DSDSASDSDSASDSDSASDSDSDNDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDS 661

Qy  413 HSGSESERE-EGDRDEASDKSGSGEDESSEDEARAARDKEEIFGSDADSEDDADSDDEDR 471
        | |: |: : : I : :: : I : Ihlh hill =
Db  662 DSDSDSDSDSDSDSDSDSD-SDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSD 720

Qy  472 GQAQGGSDNDSDSGSNGGGQRSRSHSRSASPFPSGSEHSAQEDGSEAAASDS-SEADSDS 530
        : | | : | I I I h I I I I I I I : : I : I I I I h : I I I I
Db  721 SDSDSDSDSDSDSDSDSDSD-SDSDSDSDSDSDSDSDSDSDSDSDSDSASDSDSDSDSDS 779

Qy  531 D 531
        |
Db  780 D 780



RESULT 11 
US-09-421-868-2 

Sequence 2, Application US/09421868 
Patent No. 6177084 
GENERAL INFORMATION: 
APPLICANT: Foster, Timothy J. 
APPLICANT: McDevitt, Damien L. 

TITLE OF INVENTION: The S. aureus Fibrinogen Binding Protein Gene 
FILE REFERENCE: 05344.105011 
CURRENT APPLICATION NUMBER: US/09/421,868
CURRENT FILING DATE: 1999-10-19
PRIOR APPLICATION NUMBER: 08/293,728
PRIOR FILING DATE: 1994-08-22
NUMBER OF SEQ ID NOS: 20
SOFTWARE: PatentIn Ver. 2.0
SEQ ID NO 2 
LENGTH: 933 
TYPE: PRT 

ORGANISM: Staphylococcus aureus 
US-09-421-868-2 

Query Match 8.2%; Score 225.5; DB 2; Length 933; 

Best Local Similarity 21.4%; Pred. No. 2.1e-10; 

Matches 103; Conservative 76; Mismatches 187; Indels 115; Gaps 18; 



Qy   83 TIDLINP--DTYRIDPNVLLDPADEKLLEEEIQAPTSSKRSQQHAKVVPWMRKTEYISTE 140
        III: : I I I : : : I : : : : I I
Db  383 TIDQIDKTNNTYR--QTIYVNPSGDNVI APVLT 413

Qy  141 FNRYGISNEKPEVKIGVSVKQQFTEEEIYK DRDSQITAIEKTFEDAQKSISQHYS 195
        III : | | I : : I I I : III h* =
Db  414 GNLKPNTDSNALIDQQNTSIKVYKVDNAADLSES YFVNPENFEDVTNSVNITFP 467

Qy  196 KPRVTPVEVMPVFPDFKMWINPCAQVIFDSDPAPKDTSGAAALEMM SQAMIRGM- 249
        | || || :: I II :: I M I : I I
Db  468 NPNQYKVEFNT--PDDQITTPYIVWNGHIDP NSKGDLALRSTLYGYNSNIIWRSMS 522

Qy  250 MDEEGNQFVAYFL PVEETLKKRKRDQEEEMDYAPDDVYDYKIAREYNWNVKNKA 303
        || | | : : : : . i • i • • i • i
Db  523 WDNE VAFNNGSGSGDGIDKPWPEQPDEPGEIEPIPED SDS 563

Qy  304 SKGYEENYFFIFREGDGVYYNELETRVRLSKRRAKAGVQSGTNALL VVK 352
        | : I • • i i i • • •
Db  564 DPGSDSG SDSNSDSGSDSGSDSTSDSGSDSASDSDSAS 601

Qy  353 HRDMNEKELEAQEARKAQLENHEPEEEEEEEMETEEKEAGGSDEEQEKGSSSEKEGSEDE 412

Db  602 DSDSASDSDSASDSDSASDSDSDNDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDS 661

Qy  413 HSGSESERE-EGDRDEASDKSGSGEDESSEDEARAARDKEEIFGSDADSEDDADSDDEDR 471
        | | : | : : : I 1 till 1 I : : : : 1 : 1 1 : 1 1 : 1 • 1 1 1 •
Db  662 DSDSDSDSDSDSDSDSDSD-SDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSD 720

Qy  472 GQAQGGSDNDSDSGSNGGGQRSRSHSRSASPFPSGSEHSAQEDGSEAAASDS-SEADSDS 530
        : ||:||MI: 1 1 1 1 1 1 h : 1 : HM hUMI
Db  721 SDSDSDSDSDSDSDSDSDSD-SDSDSDSDSDSDSDSDSDSDSDSDSDSASDSDSDSDSDS 779

Qy  531 D 531
        |
Db  780 D 780





RESULT 12 

US-08-956-171E-5249 

; Sequence 5249, Application US/08956171E 
; Patent No. 6593114 

GENERAL INFORMATION: 

APPLICANT: Charles Kunsch 
Gil H. Choi 
Patrick S. Dillon 
Craig A. Rosen 
; Steven C. Barash 

; Michael R. Fannon 

TITLE OF INVENTION: Staphylococcus aureus Polynucleotides and Sequences 
; NUMBER OF SEQUENCES: 5256 

CORRESPONDENCE ADDRESS: 
; ADDRESSEE: Human Genome Sciences , Inc. 

STREET: 9410 Key West Avenue 
; CITY: Rockville 

; STATE: Maryland 

; COUNTRY: USA 

ZIP: 20850 
COMPUTER READABLE FORM: 

MEDIUM TYPE: Diskette, 3.50 inch, 1.4Mb storage 
; COMPUTER: HP Vectra 486/33 

OPERATING SYSTEM: MSDOS version 6.2 
; SOFTWARE: ASCII Text 

; CURRENT APPLICATION DATA: 

APPLICATION NUMBER: US/08/956,171E
FILING DATE: 20-Oct-1997 
CLASSIFICATION: <Unknown> 
PRIOR APPLICATION DATA: 
; APPLICATION NUMBER: 60/009,861 

FILING DATE: January 5, 1996 
APPLICATION NUMBER: 08/781,986 



FILING DATE: January 3, 1997 
ATTORNEY/ AGENT INFORMATION: 
NAME: Mark J. Hyman 
REGISTRATION NUMBER: 46,789 
REFERENCE/ DOCKET NUMBER: PB248P1 
TELECOMMUNICATION INFORMATION: 
TELEPHONE: (240) 314-1224 
TELEFAX: (301) 309-8439 
INFORMATION FOR SEQ ID NO: 5249: 
SEQUENCE CHARACTERISTICS: 

LENGTH: 936 amino acids 
TYPE: amino acid 
STRANDEDNESS: single 
TOPOLOGY: linear 
MOLECULE TYPE: protein 

SEQUENCE DESCRIPTION: SEQ ID NO: 5249: 
US-08-956-171E-5249 

Query Match 8.2%; Score 225.5; DB 2; Length 936; 

Best Local Similarity 21.4%; Pred. No. 2.1e-10; 

Matches 103; Conservative 76; Mismatches 187; Indels 115; Gaps 18; 

Qy 83 TIDLINP--DTYRIDPNVLLDPADEKLLEEEIQAPTSSKRSQQHAKWPWMRKTEYISTE 140

III I : : II I : : : | : : : : | | : 
Db 392 TIDQIDKTNNTYR— QTIYVNPSGDNVI APVLT 422 

Qy 141 FNRYGISNEKPEVKIGVSVKQQFTEEEIYK DRDSQITAI EKTFEDAQKSI SQHYS 195 

III : I I I : : I I I : I I I I : : : 

Db 423 GNLKPNTDSNALIDQQNTSIKVTKVDNAADLSESYFVNPENFEDVTNSVNITFP 476 

Qy 196 KPRVTPVEVMPVFPDFKMWINPCAQVIFDSDPAPKDTSGAAALEMM SQAMIRGM- 249 

I II II:: I II : : I I I I : I I 

Db 477 NPNQYKVEFNT — PDDQITTPYIVWNGHIDP NSKGDLALRSTLYGYNSNIIWRSMS 531 

Qy 250 MDEEGNQFVAYFL PVEETLKKRKRDQEEEMDYAPDDVYDYKIAREYNWNVKNKA 303

II I I : :::: : | : | : : | : | :: 

Db 532 WDNE VAFNNGSGSGDGIDKPWPEQPDEPGEIEPIPED SDS 572 

Qy 304 SKGYEENYFFI FREGDGVYYNELETRVRLSKRRAKAGVQSGTNALL WK 352 

I : | : :| | |::: 

Db 573 DPGSDSG SDSNSDSGSDSGSDSTSDSGSDSASDSDSAS 610 

Qy 353 HRDMNEKELEAQEARKAQLENHEPEEEEEEEMETEEKEAGGSDEEQEKGSSSEKEGSEDE 412 

I | :: | ::::::::: | | : : | |: : | 

Db 611 DSDSASDSDSASDSDSASDSDSDNDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDS 670 

Qy 413 HSGSESERE-EGDRDEASDKSGSGEDESSEDEARAARDKEEIFGSDADSEDDADSDDEDR 471 

I I : I : : : I I II I I I I : : : : I : I I : I I : I : I I I : 
Db 671 DSDSDSDSDSDSDSDSDSD-SDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSD 729 

Qy 472 GQAQGGSDNDSDSGSNGGGQRSRSHSRSASPFPSGSEHSAQEDGSEAAASDS-SEADSDS 530 

: 11:11111: I I I I I I I : : I Mill |::||ll 

Db 730 SDSDSDSDSDSDSDSDSDSD-SDSDSDSDSDSDSDSDSDSDSDSDSDSASDSDSDSDSDS 788 



Qy 531 D 531

|

Db 789 D 789



RESULT 13 

US-08-781-986A-5249 

; Sequence 5249, Application US/08781986A 
; Patent No. 6737248 
; GENERAL INFORMATION: 

APPLICANT: Charles Kunsch 
; TITLE OF INVENTION: Staphylococcus aureus Polynucleotides and Sequences 
NUMBER OF SEQUENCES: 5255 
CORRESPONDENCE ADDRESS: 
; ADDRESSEE: Human Genome Sciences, Inc. 

STREET: 9410 Key West Avenue 

CITY: Rockville 
; STATE: Maryland 

COUNTRY: USA 
; ZIP: 20850 

COMPUTER READABLE FORM: 

MEDIUM TYPE: Diskette, 3.50 inch, 1.4Mb storage 

COMPUTER: HP Vectra 486/33 
; OPERATING SYSTEM: MSDOS version 6.2 

SOFTWARE: ASCII Text 
; CURRENT APPLICATION DATA: 

APPLICATION NUMBER: US/08/781,986A
; FILING DATE: 

CLASSIFICATION: 435 
PRIOR APPLICATION DATA: 
; APPLICATION NUMBER: 

; FILING DATE: 

ATTORNEY/ AGENT INFORMATION: 
; NAME: Benson, Bob 

REGISTRATION NUMBER: 30,446 

REFERENCE/ DOCKET NUMBER: PB248PP 
; TELECOMMUNICATION INFORMATION: 
; TELEPHONE: (301) 309-8504 

; TELEFAX: (301) 309-8512 

; INFORMATION FOR SEQ ID NO: 5249: 

SEQUENCE CHARACTERISTICS: 
; LENGTH: 936 amino acids 

; TYPE: amino acid 

STRANDEDNESS: single 

TOPOLOGY: linear 
MOLECULE TYPE: protein 
US-08-781-986A-5249 

Query Match 8.2%; Score 225.5; DB 2; Length 936; 

Best Local Similarity 21.4%; Pred. No. 2.1e-10; 

Matches 103; Conservative 76; Mismatches 187; Indels 115; Gaps 18; 
Qy 83 TIDLINP— DTYRIDPNVLLDPADEKLLEEEIQAPTSSKRSQQHAKWPWMRKTEYISTE 140 



Db 392 TIDQIDKTNNTYR--QTIYVNPSGDNVIAPVLT 422



Qy 141 FNRYGISNEKPEVKIGVSVKQQFTEEEIYK DRDSQITAIEKTFEDAQKSISQHYS 195

III : I I I : : I I I : I I I I : : :

Db 423 GNLKPNTDSNALIDQQNTSIKVYKVDNAADLSESYFVNPENFEDVTNSVNITFP 476



Qy 196 KPRVTPVEVMPVFPDFKMWINPCAQVIFDSDPAPKDTSGAAALEMM SQAMIRGM- 249 

I II II:: I | | :: I | I I : I I 

Db 477 NPNQYKVEFNT--PDDQITTPYIWVNGHIDP NSKGDLALRSTLYGYNSNI IWRSMS 531 

Qy 250 MDEEGNQFVAYFL PVEETLKKRKRDQEEEMDYAPDDWDYKIAREYNWNVKNKA 303 

I I II: : : : : : I : I : : I : I : : 

Db 532 WDNE VAFNNGSGSGDGIDKPWPEQPDEPGEI EPI PED SDS 572 

Qy 304 SKGYEENYFFIFREGDGVYYNELETRVRLSKRRAKAGVQSGTNALL WK 352 

I : I : : I II::: 

Db 573 DPGSDSG SDSNSDSGSDSGSDSTSDSGSDSASDSDSAS 610 

Qy 353 HRDMNEKELEAQEARKAQLENHEPEEEEEEEMETEEKEAGGSDEEQEKGSSSEKEGSEDE 412 

I I :: I ::::::::: | |. : : | | : : I 

Db 611 DSDSASDSDSASDSDSASDSDSDNDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDS 670 

Qy 413 HSGSESERE-EGDRDEASDKSGSGEDESSEDEARAARDKEEIFGSDADSEDDADSDDEDR 471 

I I : I : : : I I I I I I I I : : : : I : I I : I I : I : I I I : 
Db 671 DSDSDSDSDSDSDSDSDSD-SDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSD 729 

Qy 472 GQAQGGSDNDSDSGSNGGGQRSRSHSRSASPFPSGSEHSAQEDGSEAAASDS-SEADSDS 530 

: I I : I I I I I : I I I I I I h : I : I I I I I :: I I I I 

Db 730 SDSDSDSDSDSDSDSDSDSD-SDSDSDSDSDSDSDSDSDSDSDSDSDSASDSDSDSDSDS 788

Qy 531 D 531 

I 

Db 789 D 789 



RESULT 14 
US-09-200-650E-5 

Sequence 5, Application US/09200650E 
Patent No. 6680195 
GENERAL INFORMATION: 
APPLICANT: Patti, Joseph M. 
APPLICANT: Foster, Timothy J. 
APPLICANT: Hook, Magnus A.O. 
APPLICANT: Eidhinn, Deirdre Ni 
APPLICANT: Perkins, Samuel L. 

TITLE OF INVENTION: Extracellular Matrix- Binding Proteins from Staphylococcus 
aureus 

FILE REFERENCE: P062 83US2/BAS 
CURRENT APPLICATION NUMBER: US/09/200,650E
CURRENT FILING DATE: 1998-11-25 
PRIOR APPLICATION NUMBER: 60/066,815 
PRIOR FILING DATE: 1997-11-26 
PRIOR APPLICATION NUMBER: 60/098,427 
PRIOR FILING DATE: 1998-08-31 
NUMBER OF SEQ ID NOS : 23 
SOFTWARE: PatentIn Ver. 2.0
SEQ ID NO 5 
LENGTH: 1315 
TYPE: PRT 

ORGANISM: Staphylococcus aureus 
US-09-200-650E-5 



Query Match 8.1%; Score 223.5; DB 2; Length 1315;



Best Local Similarity 21.7%; Pred. No. 5.1e-10; 

Matches 121; Conservative 85; Mismatches 225; Indels 127; Gaps 23; 

Qy 27 SGWCRVKYCNSLPDIPFDPKFITYPFDQNRFVQYKATSLEKQHKHDLLTEPDLGVTIDL 86 

: I I : I : : I I I : I III : I I I : I 

Db 772 TGVI NGADNMTLDSGF — YKTPKYNLGNYVWEDTNKDGKQDSTEKGISGVTVTL 823 

Qy 87 INPD TYRID PNVLLDPADEKLLEEEIQAP 115 

I : II::: I : I I : : : 

Db 824 KNENGEVLQTTKTDKDGKYQFTGLENGTYKVEFETPSGYTPTQVGSGTDEG-IDSNGTST 882 

Qy 116 TSSKRSQQHAKV VPWMRKTEYISTEFNRYGISNEKPEVKIGVSVKQQFTEEEIYK 170 

I : : : : I : I : : I : I : : : : I I : I I 
Db 883 TGVI KDKDNDTI DSGFYKPTYNLGDYVWEDTNKNGVQDKDEKGI SGVTV TLK 934 

Qy 171 DRDSQITAI EKTFEDAQKS I SQ HYSKPRVTPVEVMPVFPDFKMWINPCAQVI FDSD 226 

I : : : I I : : : I II I : : 

Db 935 DENDKVXKTVTTDENGKYQFTDLNNGTYKVEFETPSGYTPT SVTSGN 981 

Qy 227 PAP KDT S GAAAL EMMS QAMI RGMMD E EG NQFVAYFLPVEETLKKRKRDQEE 277 

I I : : I : : I I : I : I I : : I I : I I 

Db 982 DTEKDSNGLTTTGVIKDA — DNMTLDSGFYKTPKYSLGDYVWY DSNKDGKQDSTE 1034 

Qy 278 EMDYAPDDVYDYKIAREYNWNVKNK — ASKGYEENYFFIFREGDGVYYNELETRVRLSKR 335 

: : I I : I I : : : I I : I I I : I I 

Db 1035 K GIKDVKVTL LNEKGEVIGTTKTDENGKYCFDNLDSGKY KVIFEK- 1079 

Qy 336 RAKAGV-QSGTNALLWKHRDMNEKELEAQEARKAQLENHEPEEEEEEEMETEEKEAGGS 394 

I I : I : I I I |||:: : I : I III: I 
Db 1080 — PAGLTQTGTNTTEDDKDADGGEVDVTITDHDDFTLDNGYYEEETSD S 1126 

Qy 395 DEEQEKGSSSEKEGSEDEHSGSESEREEGDRDEASDKSGSGEDESSEDEARAARDKEEIF 454 

I : : II::: I I |:|: : I I II I I I |: :: : I : 
Db 1127 DSDSDSDSDSDRDSDSDSDSDSDSD-SDSDSDSDSD-SDSDSDRDSDSDSDSDSDSDSDS 1184 

Qy 455 GSDADSEDDADSDDEDRGQAQGGSDNDSDSGSNGGGQRSRSHSRSASPFPSGSEHSAQED 514 

I I : I I : I : I I I : : I I : I I I I I : I I I I II: : I 

Db 1185 DSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSD SDSDSDSDSDSDSDSDSDSD 1237 

Qy 515 GSEAAASDS-SEADSDSD 531 

: III I :: I I I I I 
Db 1238 SDSDSDSDSDSDSDSDSD 1255 
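
Each alignment block above pairs a Qy row and a Db row with a marker row between them. The sketch below rebuilds only the identity part of such a marker row for the final block of RESULT 14. The function name `identity_line` is hypothetical, and the `:` marks for conservative substitutions are deliberately not reproduced because they depend on the scoring table.

```python
def identity_line(qy: str, db: str, gap: str = "-") -> str:
    """Return '|' under identical residues and ' ' elsewhere (gaps never match)."""
    if len(qy) != len(db):
        raise ValueError("aligned rows must have equal length")
    return "".join(
        "|" if a == b and a != gap else " "
        for a, b in zip(qy, db)
    )


# Aligned rows copied from the last block of RESULT 14.
qy_row = "GSEAAASDS-SEADSDSD"  # Qy 515..531 (17 residues plus one gap)
db_row = "SDSDSDSDSDSDSDSDSD"  # Db 1238..1255

marks = identity_line(qy_row, db_row)
print(qy_row)
print(marks)
print(db_row)
print("identical columns:", marks.count("|"))
```

Summing the identical columns over every block of a result should give the figure reported as Matches, provided the residues were extracted cleanly.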



RESULT 15 

US-09-949-016-10366 

; Sequence 10366, Application US/09949016 

; Patent No. 6812339 

; GENERAL INFORMATION: 

; APPLICANT: VENTER, J. Craig et al . 

; TITLE OF INVENTION: POLYMORPHISMS IN KNOWN GENES ASSOCIATED 

; TITLE OF INVENTION: WITH HUMAN DISEASE, METHODS OF DETECTION AND USES 

THEREOF 

; FILE REFERENCE: CL001307 

; CURRENT APPLICATION NUMBER: US/09/949,016 

; CURRENT FILING DATE: 2000-04-14 

; PRIOR APPLICATION NUMBER: 60/241,755 



; PRIOR FILING DATE: 2000-10-20 

; PRIOR APPLICATION NUMBER: 60/237,768 

; PRIOR FILING DATE: 2000-10-03 

; PRIOR APPLICATION NUMBER: 60/231,498 

; PRIOR FILING DATE: 2000-09-08 

; NUMBER OF SEQ ID NOS : 207012 

; SOFTWARE: FastSEQ for Windows Version 4.0 

; SEQ ID NO 10366 

LENGTH: 1259 

TYPE: PRT 

ORGANISM: Human 
US-09-949-016-10366 

Query Match 8.0%; Score 220.5; DB 2; Length 1259; 

Best Local Similarity 23.1%; Pred. No. 8.6e-10; 

Matches 93; Conservative 57; Mismatches 160; Indels 93; Gaps 16; 

Qy 171 DRDSQITAIEKTFEDAQKSISQHYSKPRVTPVEVMPVFPDFKMWINPCAQVIFDSDPAPK 230 

I I I I I I I : I i : I : I I I I 

Db 297 DHDSSI GQNSDSKEYYDPEGKE -DPHNEV — DGDKTSK 331 

Qy 231 DTSGAAALEMMSQAMIRGMMDEEGNQFVAYFLPVEETLK KR KRDQEEE 278 

: I | : : : | : | : I : I I | | | : 

Db 332 SEENSA GIPEDNGSQ RIEDTQKLNHRESKRVENRITKESETHA 374 

Qy 279 MDYAPDDWDYKIAREYNWNVKNKASKGYE ENYFFI FREGDGVYYNE LETRV 330 

: : I : I I I : : I I I : I : I : I : I 

Db 375 VGKSQDKGIEIKGPSSGNRNITKEVGKGNEGKEDKGQHGMILGKGNVKTQGEWNIEGPG 434 

Qy 331 RLSKRRAKAGVQSGTNALLWKHRDMNEKELEAQEARKAQLENHEPEEEEE E 382 

: I: Mil: II :: : :: :| :| 

Db 435 QKSEPGNKVG-HSNTGS DSNSDGYDSYDFDDKSMQGDDPNSSDESNGNDDANS 486 

Qy 383 EMETEEKEAG GSDEEQEKGSSSEKEGSEDEHSGSESEREEGDR DEASDK 431 

I : I | | | :: | : | : : | : M : I I I : I :: : I I 

Db 487 ESDNNSSSRGDASYNSDESKDNGNGSDSKGAEDDDSDSTSDTNNSDSNGNGNNGNDDNDK 546 

Qy 432 SGSGE DESSEDEARAARDKEEIFGSDADSEDDADSDDEDRGQAQGGSDNDSDSGSNG 488 

III: II I : :: : I I : I I I I I I : : I II I |: 

Db 547 SDSGKGKSDSSDSDSSDSSNSSDSSDSSDSDSSDSNSSSDSDSSDSDSSDSSDSDS-SDS 605 

Qy 489 GGQRSRSHSRSASPFPSGSEHSAQEDGSEAAASDSSEADSDSD 531 

II : I I : I : I : I I I I : : I I I I 

Db 606 SNSSDSSDSSDSSDSSDSSDSSDSKSDSSKSESDSSDSDSKSD 648 



Search completed: April 25, 2006, 09:12:43 
Job time : 48 secs



GenCore version 5.1.7 
Copyright (c) 1993 - 2006 Biocceleration Ltd. 



OM protein - protein search, using sw model 
Run on: April 25, 2006, 09:22:24 



Search time 164 Seconds 

(without alignments) 

1352.850 Million cell updates/sec 



Title: US-10-721-553-2
Perfect score: 2764
Sequence: 1 MAPTIQTQAQREDGHRPNSH..........QEDGSEAAASDSSEADSDSD 531

Scoring table: BLOSUM62
Gapop 10.0 , Gapext 0.5



Searched: 1867569 seqs, 417829326 residues 

Total number of hits satisfying chosen parameters: 1867569

Minimum DB seq length: 0
Maximum DB seq length: 2000000000

Post-processing: Minimum Match 0%
Maximum Match 100%
Listing first 45 summaries

Database: Published_Applications_AA_Main:



/cgn2_6/ptodata/l/pubpaa/US07_PUBCOMB.pep:* 
/cgn2_6/ptodata/l/pubpaa/US08_PUBCOMB.pep:* 
/cgn2_6/ptodata/l/pubpaa/US09_PUBCOMB.pep:* 
/cgn2_6/ptodata/l/pubpaa/US10A_PUBCOMB.pep:*
/cgn2_6/ptodata/l/pubpaa/US10B_PUBCOMB.pep:* 
/cgn2_6/ptodata/1/pubpaa/US11_PUBCOMB.pep:*



Pred. No. is the number of results predicted by chance to have a 
score greater than or equal to the score of the result being printed, 
and is derived by analysis of the total score distribution. 
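
The throughput figure in the run header can be checked against the other header values. The sketch below assumes, as an inference from the numbers rather than from any tool documentation, that one "cell update" is one query residue compared with one database residue, so the rate is query length times database residues divided by search time. The function name is hypothetical.

```python
def cell_updates_per_sec(query_len: int, db_residues: int, seconds: float) -> float:
    """Dynamic-programming cells filled per second (assumed definition)."""
    return query_len * db_residues / seconds


# Values taken from this run's header: 531-residue query,
# 417829326 database residues, 164 seconds of search time.
rate = cell_updates_per_sec(531, 417_829_326, 164)
print(f"{rate / 1e6:.3f} Million cell updates/sec")  # prints 1352.850
```

Under that assumption the result agrees with the 1352.850 Million cell updates/sec reported above.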

SUMMARIES 
Query
337.5 12.2
8.6
US-10-690-184-4 Sequence 4, Appli
16 234.5 8.5 1742 4
3
3
4
Sequence
38
8.
Sequence
39 221.5 0 1349 3
Sequence
220.5 8.
Sequence
41 220 5 US-10-470-048B-58 Sequence 58, Appl
42 7.9 3 US-09-815-242-5471 Sequence
43 7.9 3
Sequence 12544, A
44 7.9
45 7.9 5 US-10-744-672-7 Sequence 7, Appli

ALIGNMENTS 



RESULT 1 
US-09-840-787-9 

; Sequence 9, Application US/09840787 
; Patent No. US20020058264A1 
GENERAL INFORMATION: 

APPLICANT: Lal, Preeti
; Hillman, Jennifer L. 

; Bandman, Olga 

; Shah, Purvi 

; Au-Young, Janice 

; Yue, Henry 

; Guegler, Karl J. 

; Corley, Neil C. 

TITLE OF INVENTION: HUMAN REGULATORY MOLECULES 

NUMBER OF SEQUENCES: 98 

CORRESPONDENCE ADDRESS: 
; ADDRESSEE: Incyte Pharmaceuticals, Inc. 

; STREET: 3174 Porter Drive 



; CITY: Palo Alto 

STATE: CA 
COUNTRY: USA 
ZIP: 94304 
COMPUTER READABLE FORM: 

MEDIUM TYPE: Diskette 
COMPUTER: IBM Compatible 
OPERATING SYSTEM: DOS 
; SOFTWARE: FastSEQ for Windows Version 2.0 

CURRENT APPLICATION DATA: 
; APPLICATION NUMBER: US/09/840,787 

; FILING DATE: 23-Apr-2001 

CLASSIFICATION: <Unknown> 
PRIOR APPLICATION DATA: 

APPLICATION NUMBER: 09/518,865 
FILING DATE: <Unknown> 
ATTORNEY/AGENT INFORMATION: 
; NAME: Billings, Lucy J. 

REGISTRATION NUMBER: 36,749 
; REFERENCE/ DOCKET NUMBER: PF-0356 US 

TELECOMMUNICATION INFORMATION: 
TELEPHONE: 415-855-0555 
; TELEFAX: 415-845-4166 

; TELEX : <Unknown> 

INFORMATION FOR SEQ ID NO: 9: 
SEQUENCE CHARACTERISTICS: 
; LENGTH: 531 amino acids 

; TYPE: amino acid 

STRANDEDNESS: single 
; TOPOLOGY: linear 

; IMMEDIATE SOURCE: 

LIBRARY: PITUNOR01 
CLONE: 98974 
SEQUENCE DESCRIPTION: SEQ ID NO: 9 : 
US-09-840-787-9 



Query Match 100.0%; Score 2764; DB 3; Length 531; 

Best Local Similarity 100.0%; Pred. No. 1e-166;

Matches 531; Conservative 0; Mismatches 0; Indels 0; Gaps 0; 



Qy 1 MAPTIQTQAQREDGHRPNSHRTLPERSGVVCRVKYCNSLPDIPFDPKFITYPFDQNRFVQ 60

||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||

Db 1 MAPTIQTQAQREDGHRPNSHRTLPERSGVVCRVKYCNSLPDIPFDPKFITYPFDQNRFVQ 60

Qy 61 YKATSLEKQHKHDLLTEPDLGVTIDLINPDTYRIDPNVLLDPADEKLLEEEIQAPTSSKR 120

||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||

Db 61 YKATSLEKQHKHDLLTEPDLGVTIDLINPDTYRIDPNVLLDPADEKLLEEEIQAPTSSKR 120

Qy 121 SQQHAKVVPWMRKTEYISTEFNRYGISNEKPEVKIGVSVKQQFTEEEIYKDRDSQITAIE 180

||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||

Db 121 SQQHAKVVPWMRKTEYISTEFNRYGISNEKPEVKIGVSVKQQFTEEEIYKDRDSQITAIE 180

Qy 181 KTFEDAQKSISQHYSKPRVTPVEVMPVFPDFKMWINPCAQVIFDSDPAPKDTSGAAALEM 240

||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||

Db 181 KTFEDAQKSISQHYSKPRVTPVEVMPVFPDFKMWINPCAQVIFDSDPAPKDTSGAAALEM 240

Qy 241 MSQAMIRGMMDEEGNQFVAYFLPVEETLKKRKRDQEEEMDYAPDDVYDYKIAREYNWNVK 300

||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||

Db 241 MSQAMIRGMMDEEGNQFVAYFLPVEETLKKRKRDQEEEMDYAPDDVYDYKIAREYNWNVK 300

Qy 301 NKASKGYEENYFFIFREGDGVYYNELETRVRLSKRRAKAGVQSGTNALLVVKHRDMNEKE 360

||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||

Db 301 NKASKGYEENYFFIFREGDGVYYNELETRVRLSKRRAKAGVQSGTNALLVVKHRDMNEKE 360

Qy 361 LEAQEARKAQLENHEPEEEEEEEMETEEKEAGGSDEEQEKGSSSEKEGSEDEHSGSESER 420

||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||

Db 361 LEAQEARKAQLENHEPEEEEEEEMETEEKEAGGSDEEQEKGSSSEKEGSEDEHSGSESER 420

Qy 421 EEGDRDEASDKSGSGEDESSEDEARAARDKEEIFGSDADSEDDADSDDEDRGQAQGGSDN 480

||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||

Db 421 EEGDRDEASDKSGSGEDESSEDEARAARDKEEIFGSDADSEDDADSDDEDRGQAQGGSDN 480

Qy 481 DSDSGSNGGGQRSRSHSRSASPFPSGSEHSAQEDGSEAAASDSSEADSDSD 531

|||||||||||||||||||||||||||||||||||||||||||||||||||

Db 481 DSDSGSNGGGQRSRSHSRSASPFPSGSEHSAQEDGSEAAASDSSEADSDSD 531




RESULT 2 








US-10-721-553-2





Sequence 2, Application US/10721553 
Publication No. US20050032079A1 



; GENERAL INFORMATION: 

; APPLICANT: Batra, Surinder K. 

; APPLICANT: Hollingsworth, Michael A. 

; APPLICANT: University of Nebraska Board of Regents 

; TITLE OF INVENTION: Novel Gene That is Amplified and 

; TITLE OF INVENTION: Overexpressed in Cancer and Methods of Use Thereof 
; FILE REFERENCE: UNMC63121 

; CURRENT APPLICATION NUMBER: US/10/721,553 

; CURRENT FILING DATE: 2003-11-25 

; PRIOR APPLICATION NUMBER: US/09/647,143

; PRIOR FILING DATE: 2000-09-27 

; PRIOR APPLICATION NUMBER: PCT/US99/06633 

; PRIOR FILING DATE: 1999-03-26 

; PRIOR APPLICATION NUMBER: 60/079,649 

; PRIOR FILING DATE: 1998-03-27 

; NUMBER OF SEQ ID NOS : 22 

; SOFTWARE: FastSEQ for Windows Version 3.0 
; SEQ ID NO 2 

LENGTH: 531 

TYPE: PRT 
; ORGANISM: Homo sapiens 
US-10-721-553-2 

Query Match 100.0%; Score 2764; DB 5; Length 531; 

Best Local Similarity 100.0%; Pred. No. 1e-166;

Matches 531; Conservative 0; Mismatches 0; Indels 0; Gaps 0; 

Qy 1 MAPTIQTQAQREDGHRPNSHRTLPERSGVVCRVKYCNSLPDIPFDPKFITYPFDQNRFVQ 60

||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||

Db 1 MAPTIQTQAQREDGHRPNSHRTLPERSGVVCRVKYCNSLPDIPFDPKFITYPFDQNRFVQ 60

Qy 61 YKATSLEKQHKHDLLTEPDLGVTIDLINPDTYRIDPNVLLDPADEKLLEEEIQAPTSSKR 120

||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||

Db 61 YKATSLEKQHKHDLLTEPDLGVTIDLINPDTYRIDPNVLLDPADEKLLEEEIQAPTSSKR 120

Qy 121 SQQHAKVVPWMRKTEYISTEFNRYGISNEKPEVKIGVSVKQQFTEEEIYKDRDSQITAIE 180

||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||

Db 121 SQQHAKVVPWMRKTEYISTEFNRYGISNEKPEVKIGVSVKQQFTEEEIYKDRDSQITAIE 180

Qy 181 KTFEDAQKSISQHYSKPRVTPVEVMPVFPDFKMWINPCAQVIFDSDPAPKDTSGAAALEM 240

||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||

Db 181 KTFEDAQKSISQHYSKPRVTPVEVMPVFPDFKMWINPCAQVIFDSDPAPKDTSGAAALEM 240

Qy 241 MSQAMIRGMMDEEGNQFVAYFLPVEETLKKRKRDQEEEMDYAPDDVYDYKIAREYNWNVK 300

||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||

Db 241 MSQAMIRGMMDEEGNQFVAYFLPVEETLKKRKRDQEEEMDYAPDDVYDYKIAREYNWNVK 300

Qy 301 NKASKGYEENYFFIFREGDGVYYNELETRVRLSKRRAKAGVQSGTNALLVVKHRDMNEKE 360

||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||

Db 301 NKASKGYEENYFFIFREGDGVYYNELETRVRLSKRRAKAGVQSGTNALLVVKHRDMNEKE 360

Qy 361 LEAQEARKAQLENHEPEEEEEEEMETEEKEAGGSDEEQEKGSSSEKEGSEDEHSGSESER 420

||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||

Db 361 LEAQEARKAQLENHEPEEEEEEEMETEEKEAGGSDEEQEKGSSSEKEGSEDEHSGSESER 420

Qy 421 EEGDRDEASDKSGSGEDESSEDEARAARDKEEIFGSDADSEDDADSDDEDRGQAQGGSDN 480

||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||

Db 421 EEGDRDEASDKSGSGEDESSEDEARAARDKEEIFGSDADSEDDADSDDEDRGQAQGGSDN 480

Qy 481 DSDSGSNGGGQRSRSHSRSASPFPSGSEHSAQEDGSEAAASDSSEADSDSD 531

|||||||||||||||||||||||||||||||||||||||||||||||||||

Db 481 DSDSGSNGGGQRSRSHSRSASPFPSGSEHSAQEDGSEAAASDSSEADSDSD 531




RESULT 3 








US-10-450-763-50041





Sequence 50041, Application US/10450763 
Publication No. US20050196754A1 



GENERAL INFORMATION: 
APPLICANT: Hyseq, Inc 

TITLE OF INVENTION: NOVEL NUCLEIC ACIDS AND POLYPEPTIDES 
FILE REFERENCE: 790CIP3/US 

CURRENT APPLICATION NUMBER: US/10/450,763
CURRENT FILING DATE: 2003-06-11
PRIOR APPLICATION NUMBER: PCT/US01/08631
PRIOR FILING DATE: 2001-03-30 
PRIOR APPLICATION NUMBER: 09/540,217 
PRIOR FILING DATE: 2000-03-31 
PRIOR APPLICATION NUMBER: 09/649,167 
PRIOR FILING DATE: 2000-08-23 
NUMBER OF SEQ ID NOS : 60736 
SOFTWARE: Custom 
SEQ ID NO 50041 

LENGTH: 553 

TYPE: PRT 

ORGANISM: Homo sapiens 
FEATURE : 

NAME/KEY: DOMAIN 
LOCATION: (377)..(428)



; OTHER INFORMATION: Neuromodulin (GAP-43) proteins domain identified by
eMATRIX,

; OTHER INFORMATION: accession number BL00412D, p-value=9.633e-09, raw score
of 16.54

FEATURE:
; NAME/KEY: misc_feature

LOCATION: (1)..(553)
; OTHER INFORMATION: Xaa = X or * as defined in Table 2 
US-10-450-763-50041 



Query Match 96.2%; Score 2658.5; DB 5; Length 553; 

Best Local Similarity 95.0%; Pred. No. 5e-160; 

Matches 515; Conservative 6; Mismatches 10; Indels 11; Gaps 1; 
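
The two percentages in a result header can be reproduced from the other figures printed with them. The sketch below assumes, based on the numbers in this report rather than on tool documentation, that Query Match is the score divided by the perfect score and that Best Local Similarity is matches divided by all alignment columns (matches, conservative, mismatches, indels). Both function names are hypothetical.

```python
def query_match(score: float, perfect_score: float) -> float:
    """Score as a percentage of the query's self-alignment score."""
    return 100.0 * score / perfect_score


def best_local_similarity(matches: int, conservative: int,
                          mismatches: int, indels: int) -> float:
    """Identical columns as a percentage of all alignment columns."""
    columns = matches + conservative + mismatches + indels
    return 100.0 * matches / columns


# Figures for RESULT 3 (US-10-450-763-50041); perfect score from the run header.
qm = query_match(2658.5, 2764)
bls = best_local_similarity(515, 6, 10, 11)
print(f"Query Match {qm:.1f}%; Best Local Similarity {bls:.1f}%")
```

With these inputs the sketch gives 96.2% and 95.0%, the values reported for this result. The same two formulas also reproduce the 8.2% and 21.4% shown for RESULT 12 in the previous search.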



Qy 1 MAPTIQTQAQREDGHRPNSHRTLPERSGVVCRVKYCNSLPDIPFDPKFITYPFDQNRFVQ 60

||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||

Db 12 MAPTIQTQAQREDGHRPNSHRTLPERSGVVCRVKYCNSLPDIPFDPKFITYPFDQNRFVQ 71

Qy 61 YKATSLEKQHKHDLLTEPDLGVTIDLINPDTYRIDPNVLLDPADEKLLEEEIQAPTSSKR 120

||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||

Db 72 YKATSLEKQHKHDLLTEPDLGVTIDLINPDTYRIDPNVLLDPADEKLLEEEIQAPTSSKR 131

Qy 121 SQQHAKVVPWMRKTEYISTEFNRYGISNEKPEVKIGVSVKQQFTEEEIYKDRDSQITAIE 180

|||||||||||||||||||||||| | :||||||   |||||||||||||||||||||||

Db 132 SQQHAKVVPWMRKTEYISTEFNRYCIFHEKPEVKKWGSVKQQFTEEEIYKDRDSQITAIE 191

Qy 181 KTFEDAQKS-----------ISQHYSKPRVTPVEVMPVFPDFKMWINPCAQVIFDSDPAP 229

|||||||||           ||||||||||||||||||||||||||||||||||||||||

Db 192 KTFEDAQKSVIEGLGWGEARISQHYSKPRVTPVEVMPVFPDFKMWINPCAQVIFDSDPAP 251

Qy 230 KDTSGAAALEMMSQAMIRGMMDEEGNQFVAYFLPVEETLKKRKRDQEEEMDYAPDDVYDY 289

||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||

Db 252 KDTSGAAALEMMSQAMIRGMMDEEGNQFVAYFLPVEETLKKRKRDQEEEMDYAPDDVYDY 311

Qy 290 KIAREYNWNVKNKASKGYEENYFFIFREGDGVYYNELETRVRLSKRRAKAGVQSGTNALL 349

||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||

Db 312 KIAREYNWNVKNKASKGYEENYFFIFREGDGVYYNELETRVRLSKRRAKAGVQSGTNALL 371

Qy 350 VVKHRDMNEKELEAQEARKAQLENHEPEEEEEEEMETEEKEAGGSDEEQEKGSSSEKEGS 409

|||||||||||||||| |||||||||||||||||||||||||||||||||||||||||||

Db 372 VVKHRDMNEKELEAQETRKAQLENHEPEEEEEEEMETEEKEAGGSDEEQEKGSSSEKEGS 431

Qy 410 EDEHSGSESEREEGDRDEASDKSGSGEDESSEDEARAARDKEEIFGSDADSEDDADSDDE 469

|||||||||||||||| |||||||||:|:||:  ||||||||||||||||||||||||||

Db 432 EDEHSGSESEREEGDRHEASDKSGSGQDDSSDYXARAARDKEEIFGSDADSEDDADSDDE 491

Qy 470 DRGQAQGGSDNDSDSGSNGGGQRSRSHSRSASPFPSGSEHSAQEDGSEAAASDSSEADSD 529

|||||||||||||||| ||||||:||||||||||||||||||||:|||||||||||||||

Db 492 DRGQAQGGSDNDSDSGRNGGGQRTRSHSRSASPFPSGSEHSAQENGSEAAASDSSEADSD 551

Qy 530 SD 531

||

Db 552 SD 553





RESULT 4 

US-09-986-480-410 



; Sequence 410, Application US/09986480 

; Publication No. US20030027999A1 

; GENERAL INFORMATION: 

; APPLICANT: Rosen et al. 

; TITLE OF INVENTION: 143 Human Secreted Proteins 
; FILE REFERENCE: PS500P1 

; CURRENT APPLICATION NUMBER: US/09/986,480

; CURRENT FILING DATE: 2001-11-08 

; PRIOR APPLICATION NUMBER: PCT/US00/12788 

; PRIOR FILING DATE: 2000-05-11 

; PRIOR APPLICATION NUMBER: US 60/134,068 

; PRIOR FILING DATE: 1999-05-13 

; NUMBER OF SEQ ID NOS : 456 

; SOFTWARE: PatentIn Ver. 2.0

; SEQ ID NO 410 

; LENGTH: 473 

; TYPE: PRT 

; ORGANISM: Homo sapiens 

FEATURE: 

NAME/ KEY: SITE 
; LOCATION: (405) 

; OTHER INFORMATION: Xaa equals any of the naturally occurring L-amino acids 
US-09-986-480-410 



Query Match 89.1%; Score 2464; DB 3; Length 473; 

Best Local Similarity 99.8%; Pred. No. 8.5e-148; 

Matches 472; Conservative 0; Mismatches 1; Indels 0; Gaps 0; 



Qy 1 MAPTIQTQAQREDGHRPNSHRTLPERSGVVCRVKYCNSLPDIPFDPKFITYPFDQNRFVQ 60

||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||

Db 1 MAPTIQTQAQREDGHRPNSHRTLPERSGVVCRVKYCNSLPDIPFDPKFITYPFDQNRFVQ 60

Qy 61 YKATSLEKQHKHDLLTEPDLGVTIDLINPDTYRIDPNVLLDPADEKLLEEEIQAPTSSKR 120

||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||

Db 61 YKATSLEKQHKHDLLTEPDLGVTIDLINPDTYRIDPNVLLDPADEKLLEEEIQAPTSSKR 120

Qy 121 SQQHAKVVPWMRKTEYISTEFNRYGISNEKPEVKIGVSVKQQFTEEEIYKDRDSQITAIE 180

||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||

Db 121 SQQHAKVVPWMRKTEYISTEFNRYGISNEKPEVKIGVSVKQQFTEEEIYKDRDSQITAIE 180

Qy 181 KTFEDAQKSISQHYSKPRVTPVEVMPVFPDFKMWINPCAQVIFDSDPAPKDTSGAAALEM 240

||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||

Db 181 KTFEDAQKSISQHYSKPRVTPVEVMPVFPDFKMWINPCAQVIFDSDPAPKDTSGAAALEM 240

Qy 241 MSQAMIRGMMDEEGNQFVAYFLPVEETLKKRKRDQEEEMDYAPDDVYDYKIAREYNWNVK 300

||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||

Db 241 MSQAMIRGMMDEEGNQFVAYFLPVEETLKKRKRDQEEEMDYAPDDVYDYKIAREYNWNVK 300

Qy 301 NKASKGYEENYFFIFREGDGVYYNELETRVRLSKRRAKAGVQSGTNALLVVKHRDMNEKE 360

||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||

Db 301 NKASKGYEENYFFIFREGDGVYYNELETRVRLSKRRAKAGVQSGTNALLVVKHRDMNEKE 360

Qy 361 LEAQEARKAQLENHEPEEEEEEEMETEEKEAGGSDEEQEKGSSSEKEGSEDEHSGSESER 420

|||||||||||||||||||||||||||||||||||||||||||| |||||||||||||||

Db 361 LEAQEARKAQLENHEPEEEEEEEMETEEKEAGGSDEEQEKGSSSXKEGSEDEHSGSESER 420

Qy 421 EEGDRDEASDKSGSGEDESSEDEARAARDKEEIFGSDADSEDDADSDDEDRGQ 473

|||||||||||||||||||||||||||||||||||||||||||||||||||||

Db 421 EEGDRDEASDKSGSGEDESSEDEARAARDKEEIFGSDADSEDDADSDDEDRGQ 473



RESULT 5 

US-10-863-332-410 

; Sequence 410, Application US/10863332 

; Publication No. US20050064458A1 

; GENERAL INFORMATION: 

; APPLICANT: Rosen et al . 

; TITLE OF INVENTION: 143 Human Secreted Proteins 
; FILE REFERENCE: PS500P1 

; CURRENT APPLICATION NUMBER: US/10/863,332 

; CURRENT FILING DATE: 2004-06-09 

; PRIOR APPLICATION NUMBER: US/09/986,480 

; PRIOR FILING DATE: 2001-11-08 

; PRIOR APPLICATION NUMBER: PCT/US00/12788 

; PRIOR FILING DATE: 2000-05-11 

; PRIOR APPLICATION NUMBER: US 60/134,068 

; PRIOR FILING DATE: 1999-05-13 

; NUMBER OF SEQ ID NOS : 456 

SOFTWARE: PatentIn Ver. 2.0
; SEQ ID NO 410 
; LENGTH: 473 
; TYPE: PRT 

; ORGANISM: Homo sapiens 

FEATURE: 

NAME/KEY: SITE 
; LOCATION: (405) 

; OTHER INFORMATION: Xaa equals any of the naturally occurring L-amino acids 
US-10-863-332-410 



Query Match 89.1%; Score 2464; DB 5; Length 473; 

Best Local Similarity 99.8%; Pred. No. 8.5e-148; 

Matches 472; Conservative 0; Mismatches 1; Indels 0; Gaps 0;



Qy 1 MAPTIQTQAQREDGHRPNSHRTLPERSGVVCRVKYCNSLPDIPFDPKFITYPFDQNRFVQ 60

||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||

Db 1 MAPTIQTQAQREDGHRPNSHRTLPERSGVVCRVKYCNSLPDIPFDPKFITYPFDQNRFVQ 60

Qy 61 YKATSLEKQHKHDLLTEPDLGVTIDLINPDTYRIDPNVLLDPADEKLLEEEIQAPTSSKR 120

||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||

Db 61 YKATSLEKQHKHDLLTEPDLGVTIDLINPDTYRIDPNVLLDPADEKLLEEEIQAPTSSKR 120

Qy 121 SQQHAKVVPWMRKTEYISTEFNRYGISNEKPEVKIGVSVKQQFTEEEIYKDRDSQITAIE 180

||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||

Db 121 SQQHAKVVPWMRKTEYISTEFNRYGISNEKPEVKIGVSVKQQFTEEEIYKDRDSQITAIE 180

Qy 181 KTFEDAQKSISQHYSKPRVTPVEVMPVFPDFKMWINPCAQVIFDSDPAPKDTSGAAALEM 240

||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||

Db 181 KTFEDAQKSISQHYSKPRVTPVEVMPVFPDFKMWINPCAQVIFDSDPAPKDTSGAAALEM 240

Qy 241 MSQAMIRGMMDEEGNQFVAYFLPVEETLKKRKRDQEEEMDYAPDDVYDYKIAREYNWNVK 300

||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||

Db 241 MSQAMIRGMMDEEGNQFVAYFLPVEETLKKRKRDQEEEMDYAPDDVYDYKIAREYNWNVK 300

Qy 301 NKASKGYEENYFFIFREGDGVYYNELETRVRLSKRRAKAGVQSGTNALLVVKHRDMNEKE 360

||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||

Db 301 NKASKGYEENYFFIFREGDGVYYNELETRVRLSKRRAKAGVQSGTNALLVVKHRDMNEKE 360

Qy 361 LEAQEARKAQLENHEPEEEEEEEMETEEKEAGGSDEEQEKGSSSEKEGSEDEHSGSESER 420

|||||||||||||||||||||||||||||||||||||||||||| |||||||||||||||

Db 361 LEAQEARKAQLENHEPEEEEEEEMETEEKEAGGSDEEQEKGSSSXKEGSEDEHSGSESER 420

Qy 421 EEGDRDEASDKSGSGEDESSEDEARAARDKEEIFGSDADSEDDADSDDEDRGQ 473

|||||||||||||||||||||||||||||||||||||||||||||||||||||

Db 421 EEGDRDEASDKSGSGEDESSEDEARAARDKEEIFGSDADSEDDADSDDEDRGQ 473



RESULT 6 

US-11-097-143-4281 

; Sequence 4281, Application US/11097143 

; Publication No. US20050208558A1 

; GENERAL INFORMATION: 

; APPLICANT: Venter, J. Craig 

; APPLICANT: et al . 

; TITLE OF INVENTION: DETECTION KIT, SUCH AS NUCLEIC ACID 

; TITLE OF INVENTION: ARRAYS, FOR DETECTING EXPRESSION OF 10,000 OR MORE 
; TITLE OF INVENTION: DROSOPHILA GENES. 
; FILE REFERENCE: CL000728 

; CURRENT APPLICATION NUMBER: US/11/097,143 

; CURRENT FILING DATE: 2005-04-04 

; PRIOR APPLICATION NUMBER: 60/157,832 

; PRIOR FILING DATE: 1999-10-05 

; PRIOR APPLICATION NUMBER: 60/160,191 

; PRIOR FILING DATE: 1999-10-19 

; PRIOR APPLICATION NUMBER: 60/161,932 

; PRIOR FILING DATE: 1999-10-28 

; PRIOR APPLICATION NUMBER: 60/164,769 

; PRIOR FILING DATE: 1999-11-12 

; PRIOR APPLICATION NUMBER: 60/173,383 

; PRIOR FILING DATE: 1999-12-28 

; PRIOR APPLICATION NUMBER: 60/175,693 

; PRIOR FILING DATE: 2000-01-12 

; PRIOR APPLICATION NUMBER: 60/184,831 

; PRIOR FILING DATE: 2000-02-24 

; PRIOR APPLICATION NUMBER: 60/191,637 

; PRIOR FILING DATE: 2000-03-23 

; NUMBER OF SEQ ID NOS : 43008 

; SOFTWARE: FastSEQ for Windows Version 4.0 

; SEQ ID NO 4281 

; LENGTH: 538
; TYPE: PRT
; ORGANISM: DROSOPHILA
US-11-097-143-4281 

Query Match 45.0%; Score 1244.5; DB 6; Length 538; 

Best Local Similarity 50.0%; Pred. No. 1.5e-70; 

Matches 271; Conservative 66; Mismatches 172; Indels 33; Gaps 1 

Qy   1 MAPTIQTQAQREDGHRPNSHRTLPERSGVVCRVKYCNSLPDIPFDPKFITYPFDQNRFVQ 60

| | | | | : | : | :: | | I I I I : I I I I I I I I I : II I I : I I I I 

Db   1 MPPTINNSAVNSAAEK-RPQRQTERKSEIICRVKYGNNLPDIPFDLKFLQYPFDSHRFVQ 59



Qy 61 YKATSLEKQHKHDLLTEPDLGVTIDLINPDTYRIDPNVLLDPADEKLLEEEIQAPTSSKR 120

| Mil: 1:1:111 I I Mhlll I : I: I I I I I I I I I I I I I I II I I 
Db 60 YNPTSLERNFKYDVLTEHDLGVTVDLINRELYQADSMTLLDPADEKLLEEETLTPTDSVR 119 

Qy 121 SQQHAKVVPWMRKTEYISTEFNRYGISN-EKPEVKIGVSVKQQFTEEEIYKDRDSQITAI 179

I : I I : : I I : I I : I I I I I I I : I I I I : I : I I : I I : I I I :: I I I I 
Db 120 SRQHSRTVSWLRKSEYISTEQTRFQPQNLENIEAKVGYNVKKSLREETLYLDREAQIKAI 179

Qy 180 EKTFEDAQKSISQHYSKPRVTPVEVMPVFPDFKMWINPCAQVIFDSDPAPKDTSGAAALE 239 

Mill: I : : M M I I M M : I : I I I I I I M I I I II I I I I I : I I I 
Db 180 EKTFSDTKSEITKHYSKPNVVPVEVLPIFPDFTNWKFPCAQVIFDSDPAPAGKNVPAQLE 239

Qy 240 MMSQAMIRGMMDEEGNQFVAYFLPVEETLKKRKRDQEEEMDYAPDDVYDYKIAREYNWNV 299 

I I M M I I : M I I I I M M I I I : I I : I I : I I - - I • M I M I I I I M 

Db 240 EMSQAMIRGVMDESGEQFVAYFLPTEQTLEKRRTDFINGELYKEEEEYEYKIAREYNWNV 299 

Qy 300 KNKASKGYEENYFFIFREGDGVYYNELETRVRLSKRRAKAGVQSGTNALLVVKHRDMNEK 359

| | II I I I I II I I I : I : II : I I M II I II I I : I I I I I I I MINI : : 
Db 300 KTKASKGYEENYFFVMRQ-DGIYYNELETRVRLNKRRVKVG-QQPNNTKLVVKHRPLDSM 357

Qy 360 ELEAQEARKAQLENHEPEEE EEEEM ETEE KEAGGSD 395 

I I |: III III 111:1 MM : I I : I 

Db 358 EH RMQ RYRERQ LEVP GE EEE I VEEVRE E EQMQ 1 1 GET EKT S EDAAVGAQAAS GAD S P AQV 417 

Qy 396 --EEQEKGSSSEKEGSEDEHSGSESEREEGDRDEASDKSGSGEDESSEDEARAARDKEEI 453

: I : I : I I II I I | : : : M II : I : : 

Db 418 ARDRQSRSRSRTRSGS-SSGSGSGSGSRASSRSKSGSRSGSGSRSRTNSPAGSQKSGSR- 475 

Qy 454 FGSDADSEDDADSDDEDRGQAQGGSDNDSDSGS-NGGGQRSRSHSRSASPFPSGSEHSAQ 512 

|:| : | | : : : I : I I I I : I I M I I I I I I I I I = 

Db 476 SRSVSRSRSRSKSGSRSRSRSRSKSGSRSRSGSRSGSGSRSPSRSRSGSPSGSGSSSGSA 535 

Qy 513 ED 514 

I 

Db 536 SD 537 



RESULT 7 

US-10-450-763-50040 

; Sequence 50040, Application US/10450763 
; Publication No. US20050196754A1 
; GENERAL INFORMATION : 
; APPLICANT: Hyseq, Inc 

; TITLE OF INVENTION: NOVEL NUCLEIC ACIDS AND POLYPEPTIDES 
; FILE REFERENCE: 790CIP3/US 

; CURRENT APPLICATION NUMBER: US/10/450,763 
; CURRENT FILING DATE: 2003-06-11 
; PRIOR APPLICATION NUMBER: PCT/US01/08631 
; PRIOR FILING DATE: 2001-03-30 
; PRIOR APPLICATION NUMBER: 09/540,217 
; PRIOR FILING DATE: 2000-03-31 
; PRIOR APPLICATION NUMBER: 09/649,167 
; PRIOR FILING DATE: 2000-08-23 
; NUMBER OF SEQ ID NOS : 60736 
; SOFTWARE: Custom 
; SEQ ID NO 50040 
; LENGTH: 133
; TYPE: PRT

; ORGANISM: Homo sapiens 
US-10-450-763-50040 

Query Match 22.5%; Score 622; DB 5; Length 133; 

Best Local Similarity 64.4%; Pred. No. 7.2e-32; 

Matches 130; Conservative 0; Mismatches 2; Indels 70; Gaps 1;

Qy 273 RDQEEEMDYAPDDVYDYKIAREYNWNVKNKASKGYEENYFFIFREGDGVYYNELETRVRL 332
       |||||||||||||||||||||||||||||||||||
Db   1 RDQEEEMDYAPDDVYDYKIAREYNWNVKNKASKGY------------------------- 35

Qy 333 SKRRAKAGVQSGTNALLVVKHRDMNEKELEAQEARKAQLENHEPEEEEEEEMETEEKEAG 392
                                                    |||||||||||||||
Db  36 ---------------------------------------------EEEEEEMETEEKEAG 50

Qy 393 GSDEEQEKGSSSEKEGSEDEHSGSESEREEGDRDEASDKSGSGEDESSEDEARAARDKEE 452
       || |||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  51 GSYEEQEKGSSSEKEGSEDEHSGSESEREEGDRDEASDKSGSGEDESSEDEARAARDKEE 110

Qy 453

M I I I I I M I I I I I I I M I I I 

Db 111



RESULT 8 

US-10-424-599-223174 

Sequence 223174, Application US/10424599 
Publication No. US20040031072A1 
GENERAL INFORMATION: 
APPLICANT: La Rosa Thomas J 
APPLICANT: Kovalic David K 
APPLICANT: Zhou Yihua 

TITLE OF INVENTION: Nucleic Acid Molecules and Other Molecules Associated With
TITLE OF INVENTION: Plants and Uses Thereof for Plant Improvement
FILE REFERENCE: 38-21(53223)B
CURRENT APPLICATION NUMBER: US/10/424,599
CURRENT FILING DATE: 2003-04-28
NUMBER OF SEQ ID NOS: 285684
SEQ ID NO 223174
LENGTH: 86
TYPE: PRT
ORGANISM: Glycine max
FEATURE:
OTHER INFORMATION: Clone ID: PAT_MRT3847_43557C.1.pep
US-10-424-599-223174 

Query Match 16.4%; Score 452; DB 4; Length 86; 

Best Local Similarity 100.0%; Pred. No. 2.5e-21; 

Matches 86; Conservative 0; Mismatches 0; Indels 0; Gaps 0;

Qy  66 LEKQHKHDLLTEPDLGVTIDLINPDTYRIDPNVLLDPADEKLLEEEIQAPTSSKRSQQHA 125
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   1 LEKQHKHDLLTEPDLGVTIDLINPDTYRIDPNVLLDPADEKLLEEEIQAPTSSKRSQQHA 60

Qy 126 KVVPWMRKTEYISTEFNRYGISNEKP 151
       ||||||||||||||||||||||||||
Db  61 KVVPWMRKTEYISTEFNRYGISNEKP 86



RESULT 9 

US-10-424-599-213359 

Sequence 213359, Application US/10424599 
Publication No. US20040031072A1 
GENERAL INFORMATION: 
APPLICANT: La Rosa Thomas J 
APPLICANT: Kovalic David K 
APPLICANT: Zhou Yihua 

TITLE OF INVENTION: Nucleic Acid Molecules and Other Molecules Associated With
TITLE OF INVENTION: Plants and Uses Thereof for Plant Improvement
FILE REFERENCE: 38-21(53223)B
CURRENT APPLICATION NUMBER: US/10/424,599
CURRENT FILING DATE: 2003-04-28
NUMBER OF SEQ ID NOS: 285684
SEQ ID NO 213359
LENGTH: 571
TYPE: PRT
ORGANISM: Glycine max
FEATURE:
OTHER INFORMATION: Clone ID: PAT_MRT3847_34688C.1.pep
US-10-424-599-213359

Query Match 13.6%; Score 375.5; DB 4; Length 571;
Best Local Similarity 27.2%; Pred. No. 1.6e-15;
Matches 128; Conservative 71; Mismatches 170; Indels 101; Gaps 19;

Qy 14 GHRPNSHRTLP ER SGVVCRVKYCNSLPDIPFDPKFITYPFDQNRFVQY 61

Db 150 GSRMGERRSTPLLGAERVENRLKKPTTFLCKLKFRNELPDPSAQPKLMASKKDKDQYAKY 209

Qy 62 KATSLEKQHKHDLLTEPDLGVTIDLINPDTYRIDPNVL LDPADEKLLEEE IQA 114

Db 210 titsleUkpklfv^pdlgipldlldls^-ppsvrppiapedkellrddeavtpikk 268

Qy 115 PTSSKRSQQHAKVVPWMRKTEYISTEFNRYGISNEKPEVKIGVSVKQQFTEEEI 168

• • • I I I : I I : I I I : I I I I I I I : : 

Db 269 DGIKRKERPTDKGVAWLVKTQYISP LSME STKQSLTEKQAKELREM 314 

Qy 169 YKDRDSQITAIEKTFEDAQKSISQHYSKPRVTPVEVMPVFPDFKMWINPCA 219

I • I I I I : I I I I I I : : I I I I I I : I I I = : 

Db 315 KGGRGILDNLNSRERQIREIEASFE-AAKSDPVHATNKDLYPVEVMPLLPDFDRYDDQFV 373 

Qy 220 QVIFDSDP APKDTSGAAALE MMSQAMIRGMMDEEGNQFVAYFLPVEETLK 269

I I • I III M : I : I : I : I I : I I 

Db 374 VAAFDNAPTADSEMHAKMDKSVRDAFESKAVMKSYVATGSDPANPEKFLAYMVPAPGELS 433 

Qy 270 KRKRDQEEEMDYAPDDVYDYKIAREYNWNVKNKASKGYEENYFFIFREGDGVYYNELETR 329
Db 434 ioiYDENEDVSYS wIREYhIoWGODADD-PATFLVAFDESEARYL-PLPTK 483 



Qy 330 VRLSKRRAKAGVQSGTNALLVVKHRDMNEKELEAQEARKAQLENHEPEEEEEEEMETEEK 389

: I I : I I I I : I I I : : II 

Db 484 LVLRKKRAKEG-RSG DEVEQCPVPARVTVRRRSSVAAIERK 523

Qy 390 EAGGSDEEQEKGSSSEK EGSEDEHSGS ESEREEGDRDEASD 430

: : | | | : | | : : : I I I : I I : : : : I I II 

Db 524 DSG--VYTSSKGNSSKRGGLEMDDGLEDQHRGAPHQDNYQSSGAEDYMSD 571



RESULT 10 

US-10-437-963-116147 

Sequence 116147, Application US/10437963 
Publication No. US20040123343A1 
GENERAL INFORMATION:
APPLICANT: La Rosa, Thomas J.
APPLICANT: Kovalic, David K.
APPLICANT: Zhou, Yihua
APPLICANT: Cao, Yongwei
APPLICANT: Wu, Wei
APPLICANT: Boukharov, Andrey A.
APPLICANT: Barbazuk, Brad
APPLICANT: Li, Ping
TITLE OF INVENTION: Rice Nucleic Acid Molecules and Other Molecules Associated With
TITLE OF INVENTION: Plants and Uses Thereof for Plant Improvement
FILE REFERENCE: 38-21(53221)B
CURRENT APPLICATION NUMBER: US/10/437,963
CURRENT FILING DATE: 2003-05-14 
NUMBER OF SEQ ID NOS : 204966 
SEQ ID NO 116147 
LENGTH: 644 
TYPE: PRT 

ORGANISM: Oryza sativa 
FEATURE : 

OTHER INFORMATION: Clone ID: PAT_MRT4530_19676C.1.pep
US-10-437-963-116147 



Query Match 12.2%; Score 337.5; DB 4; Length 644; 

Best Local Similarity 25.3%; Pred. No. 4.8e-13; 
Matches 109; Conservative 82; Mismatches 156; Indels 83; Gaps 17;



Qy 


17 


Db 


228 


Qy 


74 


Db 


288 


Qy 


102 


Db 


348 


Qy 


150 


Db 


408 



1 1 



I : I : I III 



I : : I : : I : I I I I I 



-LD 101 



I I I : I I 



I : I 



I I 



I I 



I : I : I I : I I 



I I : I I I I I : I III 



I :| 



|:|: I 



Qy 210 DFKMWINPCAQVIFDSDPAPKDTSGAAALE MMSQAMIRGMMDEEGNQFV 258

II:: I I I I I I : II : | : : I : : | : 

Db 462 DFDRYDDQFVMVNFDGDPT-ADSEQYNKLERSERDECESRAVMKSFLVNGSDPAKQEKFL 520 

Qy 259 AYFLPVEETLKKRKRDQEEEMDYAPDDVYDYKIAREYNWNVKNKASKGYEENYFFIFREG 318

II :| II |: |:: |: II hi h I I I : 

Db 521 AYMVPSPHELSKDLDDETEDIQYS WLREYHWEVRGD-DKDDPTTYLVTF-DD 570 

Qy 319 DGVYYNELETRVRLSKRRAKAGVQSGTNALLVVKHRDMNEKELEAQEARKAQLENHE 375

II I I I : : I I : : I I I : I I : : I : : I ::::::: 

Db 571 DGAKYLPLPTKLVLQKKKAKEG-RSGDE IEHFPVPSRITENLKRQRSSVDDDLYDH 625 

Qy 376 PEEEEEEEME 385 

I : I : I : 
Db 626 PKHSRVEDMD 635 



RESULT 11 

US-10-450-763-49771 

; Sequence 49771, Application US/10450763 
; Publication No. US20050196754A1 
; GENERAL INFORMATION: 
; APPLICANT: Hyseq, Inc 

; TITLE OF INVENTION: NOVEL NUCLEIC ACIDS AND POLYPEPTIDES 
; FILE REFERENCE: 790CIP3/US 

; CURRENT APPLICATION NUMBER: US/10/450,763 

; CURRENT FILING DATE: 2003-06-11 

; PRIOR APPLICATION NUMBER: PCT/US01/08631 

; PRIOR FILING DATE: 2001-03-30 

; PRIOR APPLICATION NUMBER: 09/540,217 

; PRIOR FILING DATE: 2000-03-31 

; PRIOR APPLICATION NUMBER: 09/649,167 

; PRIOR FILING DATE: 2000-08-23 

; NUMBER OF SEQ ID NOS : 60736 

; SOFTWARE: Custom 

; SEQ ID NO 49771 

LENGTH: 475 
; TYPE: PRT 

; ORGANISM: Homo sapiens 
; FEATURE : 

; NAME/KEY: DOMAIN
; LOCATION: (391)..(429)
; OTHER INFORMATION: ELEMENT TRANSPOSABLE INSERTION PROTEIN TRANSPOSITION DNA
; OTHER INFORMATION: domain identified by eMATRIX, accession number PD02455A, p-value=
; OTHER INFORMATION: 1.450e-25, raw score of 25.65
; FEATURE:
; NAME/KEY: DOMAIN
; LOCATION: (68)..(113)
; OTHER INFORMATION: Immunoglobulin domain identified by PFam, accession name ig,
; OTHER INFORMATION: E-value=0.099, PFam score of 10.6
US-10-450-763-49771 



Query Match 10.2%; Score 283; DB 5; Length 475; 

Best Local Similarity 87.7%; Pred. No. 9.4e-10; 

Matches 57; Conservative 3; Mismatches 5; Indels 0; Gaps 0;



Qy 328 TRVRLSKRRAKAGVQSGTNALLVVKHRDMNEKELEAQEARKAQLENHEPEEEEEEEMETE 387

: | | I I I I I I I I I I II M I I I I II I I I I I I I I I I I I I I I I I I I M I I I I M I I I I I I : 
Db   3 SRVRLSKRRAKAGVQSGTNALLVVKHRDMNEKELEAQEARKAQLENHEPEEEEEEEIRQP 62

Qy 388 EKEAG 392 

1:1 

Db 63 RKKLG 67 



RESULT 12 

US-10-425-115-202470 

Sequence 202470, Application US/10425115 
Publication No. US20040214272A1 
GENERAL INFORMATION: 
APPLICANT: La Rosa, Thomas J. 
APPLICANT: Kovalic, David K. 
APPLICANT: Zhou, Yihua 
APPLICANT: Cao, Yongwei
TITLE OF INVENTION: Nucleic Acid Molecules and Other Molecules Associated With
TITLE OF INVENTION: Plants
FILE REFERENCE: 38-21(53222)B
CURRENT APPLICATION NUMBER: US/10/425,115
CURRENT FILING DATE: 2003-04-28 
NUMBER OF SEQ ID NOS : 369326 
SEQ ID NO 202470 
LENGTH: 286 
TYPE: PRT 

ORGANISM: Zea mays 
FEATURE:
NAME/KEY: unsure
LOCATION: (1)..(286)
OTHER INFORMATION: unsure at all Xaa locations
FEATURE:
OTHER INFORMATION: Clone ID: MRT4577_116243C.1.pep
US-10-425-115-202470 

Query Match 8.6%; Score 238.5; DB 4; Length 286; 

Best Local Similarity 28.0%; Pred. No. 3.4e-07; 

Matches 76; Conservative 49; Mismatches 113; Indels 33; Gaps 9; 

Qy  30 VCRVKYCNSLPDIPFDPKFITYPFDQNRFVQYKATSLEKQHKHDLLTEPDLGVTIDLINP 89

: | : I : I I I I I : : I : : I : : I : : I I I I : : : I I I : : II : : 

Db  23 LCKHKFRNELPDPSAQLKWLPLNKDKDRYTKYRISSLEKNYLPKMIVPEDLGIPLDLLDM 82

Qy  90 DTYRIDPNVL-LDPADEKLL-EEEIQAPTS SKRSQQHAKVVPWMRKTEYISTEFNR 143

I | | I I I : I I : : I : I I : : I : I : I I : I I I 

Db  83 TVYNPPAAQLPLAPEDEELLRDDEVLTPVKPEGIRKKERPTDKGMSWLVKTQYISP 138

Qy 144 YGISNEKPEVKIGVSVKQQFTEE EIYKDRDSQITAIEKTFEDAQKSISQHYS 195

: | : : : | :: I I I I : I I I : : I : I I I I • 

Db 139

Qy 196

| | : I : I I I : : I I I I II I : II I I I 

Db 196



Qy 245 MIRGMMDEEGNQFVAYFLPVEETLKKRKRDQ 275

: I : : : I I III: 
Db 255 XVSGSDPAKXREILAYMXSSPHELVKDLDDE 285



RESULT 13 

US-10-282-122A-70437 

Sequence 70437, Application US/10282122A 
Publication No. US20040029129A1 
GENERAL INFORMATION: 
APPLICANT: Wang, Liangsu 
APPLICANT: Zamudio, Carlos 
APPLICANT: Malone, Cheryl 
APPLICANT: Haselbeck, Robert 
APPLICANT: Ohlsen, Kari 
APPLICANT: Zyskind, Judith 
APPLICANT: Wall, Daniel 
APPLICANT: Trawick, John 
APPLICANT: Carr, Grant 
APPLICANT: Yamamoto, Robert 
APPLICANT: Forsyth, R. 
APPLICANT: Xu, H. 

TITLE OF INVENTION: Identification of Essential Genes in Microorganisms 
FILE REFERENCE: ELITRA.034A
CURRENT APPLICATION NUMBER: US/10/282,122A
CURRENT FILING DATE: 2003-02-20 
PRIOR APPLICATION NUMBER: 60/191,078 
PRIOR FILING DATE: 2000-03-21 
PRIOR APPLICATION NUMBER: 60/206,848 
PRIOR FILING DATE: 2000-05-23 
PRIOR APPLICATION NUMBER: 60/207,727 
PRIOR FILING DATE: 2000-05-26 
PRIOR APPLICATION NUMBER: 60/230,335 
PRIOR FILING DATE: 2000-09-06 
PRIOR APPLICATION NUMBER: 60/230,347 
PRIOR FILING DATE: 2000-09-09 
PRIOR APPLICATION NUMBER: 60/242,578 
PRIOR FILING DATE: 2000-10-23 
PRIOR APPLICATION NUMBER: 60/253,625 
PRIOR FILING DATE: 2000-11-27 
PRIOR APPLICATION NUMBER: 60/257,931 
PRIOR FILING DATE: 2000-12-22 
PRIOR APPLICATION NUMBER: 60/267,636 
PRIOR FILING DATE: 2001-02-09 
PRIOR APPLICATION NUMBER: 60/269,308 
PRIOR FILING DATE: 2001-02-16 

Remaining Prior Application data removed - See File Wrapper or PALM. 
NUMBER OF SEQ ID NOS: 78614
SOFTWARE: PatentIn version 3.1
SEQ ID NO 70437 
LENGTH: 1633 
TYPE: PRT 

ORGANISM: Staphylococcus epidermidis 
US-10-282-122A-70437 



Query Match 8.6%; Score 237.5; DB 4; Length 1633;
Best Local Similarity 21.2%; Pred. No. 3.1e-06;
Matches 131; Conservative 85; Mismatches 238; Indels 165; Gaps 25;



Db 



Db 

Qy 

Db 



Qy 48 FITYPFDQNRFVQYKATSLEKQHKHDLLTEPDLGVTIDLINPDTYRIDP NVLLDP 102

* . . I | | : | I : : I I I : : III • I : 1 

Db 707 YVTLKDSNNRELQRVTTDQSGHYQFDNLQNGT--YTVEFAIPDNYTPSPANNSTNDAIDS 764

Qy 103 ADEKLLEEEIQAPTSSKRSQQHAKV VPWMRKTEYISTEFNRYGISNEKPEVK 154

Y I . : : : : : | I : I : : I : I I IN 

Db 765 DGERDGTRKVWAKGTINNADNMTVDTGFYLTPKYNVGDYVWEDTNKDGIQDDNEKGISN 824 

Qy 155 IGVSVKQ QFTEEEIY KDRDSQ 175 

825 vWTLKNKNGDTIGTTTTDSNGKYEFTGLENGDYTIEFETPEGYTPTKQNSGSDEGKDSN 884 

176 ITAIEKTFEDA-QKSISQHYSKPRVTPVEVMPVFPDFKMWINPCAQVIFDSDP 227 

885 GTKTTWVTCDADNKTIDSGFYKPIYN LGDY-VWEDTNKDGIQDDSEKGISGVK 936 

228 -APKDTSGAA ALEMMSQAMIRGMMDEEGNQFVAYFLPVEETLKKRKRDQEEEMDY- 281 

I I • I | : : | : I I : I : I II I : : ' 

937 VTLKDKNGNAIGTTTTDASGHYQFKGL — ENGSYTVEFETPSGYTPTKANSGQDITVDSN 994 



Db 

Qy 

Db 

Qy 

Db 

Qy 282 APDDVYD YKIAR EYNWNVKNK ASKGY 307

07 I : I II : M I I I M 

Db 

Qy 

Db 

Qy 358 EKELEAQEARKAQLENH EPEEEEEEEMETEEKEAGGSD 395 

Db 1101 EKDADGEDV^-VTITDHDDFSIDNGYFDDDSDSDSDADSDSDSDSDSDADSDSDADSNSD 1159 

Qy 396 EEQEKGSSSEKEGSEDEHSGSESERE-EGDRDEASDK-SGSGEDESSEDEARAARDKEEI 453

Y | | • : | | | : | : : : I I II II I I : : : : I ' 

1160 SDSDSDSDSDSDSDSDSDSDSDSDSDADSDSDADSDSDSDSDSDSDSDSDSDSDSDSDSD 1219 



I : I I I : : I I I I ' ' 

995 GITTTGIINGADNLTIDSGFYKTPKYSVGDYVWEDTNKDGIQDDNEKGISGVKVTLKDEK 1054 

308 EENYFFI FREGD- GVYYNELETRVRLS KRRAKAGVQS GTNALLWKHRDMN 357 

: I I : I I I I I : : : I : I : : 

1055 GNIISTTTTDENGKYQFDNLDSGNYIIHFEKPEGMTQTTANSG NDD 1100 



454 FGSDADSEDDADSDDEDRGQAQGGSDNDSDSGSNGGGQRSRSHSRSASPFPSGSEHSAQE 513 
Mill: Mill : : I hill I h I I I I I I I' : 



Mill: Mill : ' 11*1111 I : i i i i i i ■ • 

1220 SDSDADSDSDADSDSDADSDSDADSDSDSDSDSDAD SDSDSDSDSDADSDSDSDSDS 1276 



Qy 514 DGSEAAASDS-SEADSDSD 531 

I : I I I I : I I I I I I 
Db 1277 DADSDSDSDSDSDADSDSD 1295 



RESULT 14 
US-10-615-383-4 

; Sequence 4, Application US/10615383 

; Publication No. US20040038327A1 

; GENERAL INFORMATION: 

; APPLICANT: FOSTER, Timothy 

; TITLE OF INVENTION: POLYPEPTIDES AND POLYNUCLEOTIDES FROM COAGULASE-NEGATIVE
; TITLE OF INVENTION: STAPHYLOCOCCI
; FILE REFERENCE: P06335US03/BAS
CURRENT APPLICATION NUMBER: US/10/615,383
CURRENT FILING DATE: 2003-07-09 
PRIOR APPLICATION NUMBER: 09/386,962 
PRIOR FILING DATE: 1999-08-31 
PRIOR APPLICATION NUMBER: 60/098,443 
PRIOR FILING DATE: 1998-08-31 
PRIOR APPLICATION NUMBER: 60/117,119 
PRIOR FILING DATE: 1999-01-25 
NUMBER OF SEQ ID NOS: 39 
SOFTWARE: PatentIn version 3.1
SEQ ID NO 4 
LENGTH: 1742 
TYPE: PRT 

ORGANISM: Staphylococcus epidermidis 
US-10-615-383-4 

Query Match 8.5%; Score 234.5; DB 4; Length 1742; 

Best Local Similarity 21.0%; Pred. No. 5.2e-06; 

Matches 127; Conservative 88; Mismatches 240; Indels 149; Gaps 24;

Qy 48 FITYPFDQNRFVQYKATSLEKQHKHDLLTEPDLGVTIDLINPDTYRIDP NVLLDP 102

Db 716 YVTLKDSNNRELQRVTTDQSGHYQFDNLQNGT--YTVEFAIPDNYTPSPANNSTNDAIDS 773

Qy 103 ADEKLLEEEIQAPTSSKRSQQHAKV VPWMRKTEYISTEFNRYGISNEKPEVKIG 156

I . . . ... | | : I : : I : II = : : I 

Db 774 DGERDGTRKVWAKGTINNADNMTVDTGFYLTPKYNVGDYVWEDTNKDGIQDDNEKGISG 833



QY 
Db 

Qy 157 VSV KQQFT EEEIY KDRDSQ 175 

Db 

Qy 

Db 



834 VKVTLKNKNGDTIGTTTTDSNGKYEFTGLENGDYTIEFETPEGYTPTKQNSGSDEGKDSN 893 

176 ITAIEKTFEDA-QKSISQHYSKPRVTPVEVMPVFPDFKMWINPCAQVIFDSDP 227 

894 GTKTTVTVKDADNKTIDSGFYKPTYN LGDY-VWEDTNKDGIQDDSEKGISGVK 945 

Qy 228 -APKDTSGAA ALEMMSQAMIRGMMDEEGNQFVAYFLPVEETLKKRKRDQEEEMDY- 281 

946 vTl1dKNGNAIGTTT T DASGHYQFKgL--ENGSYTVEFETPSGYTPTKANSGQDITVDSN 1003 



Db 



2 8 2 APDDVYD YKIAR EYNWNVKNK- 



-ASKGY 307 



Db 

Qy 



Qy 282 AfUUVIU— imaiv" ^p.. ... ! ( 

1004 GITTTGIINGADNLTIDSGFYKTPKYSVGDYVWEDTNKDGIQDDNEKGISGVKVTLKDEK 1063 

308 eenyfFIFREGD-GVYYNELETRVRLSKRRAKAGVQSGTNALLWKHRDMN 357 

• | | : I III I : : : I : I : : 

Db 1064 GNIISTTTTDENGKYQFDNLDSGNYIIHFEKPEGMTQTTANSG NDD 1109 

Qy 358 EKELEAQEARKA- qlENHEPEEEEEEEMETEEKEAGGSDEEQEKGSSSEKEGSE 410 

Db mo EKDADGEDVRVTITDHDDFSIDNGYFDDDSD^ 1169 

0v 411 DEHSGSESERE-EGDRDEASDK-SGSGEDESSEDEARAARDKEEIFGSDADSEDDADSDD 468 

I I I • I • • • I I II II I I : : : = I • I I : I I : I : I I I 
Db 1170 DSDSDSDSDSDADSDSDSDSDSDSDSDSDADSDSDSDSDSDADSDSDSDSDSDSDSDSDS 1229 

Qy 469 EDRGQAQGGSDNDSDSGSNGGGQRSRSHSRSASPFPSGSEHSAQEDGSEAAASDS-SEAD 527 



: : I I : II I I I : I I : I I I h ♦ I : I I I I : I I 

Db 1230 DSDSDSDSDSDSDSDSDSDSD SDSDADSDSDADSDSDSDSDSDADSDSDSDSDSDAD 1286 

Qy 528 SDSD 531 

II I I 

Db 1287 SDSD 1290 



RESULT 15 
US-10-690-184-4 

Sequence 4, Application US/10690184 
Publication No. US20040141997A1 
GENERAL INFORMATION: 
APPLICANT: FOSTER, Timothy 

TITLE OF INVENTION: METHODS FOR TREATING OR PREVENTING INFECTIONS FROM 
COAGULASE- 

TITLE OF INVENTION: NEGATIVE STAPHYLOCOCCI 
FILE REFERENCE: P06335US05/BAS 
CURRENT APPLICATION NUMBER: US/10/690, 184 
CURRENT FILING DATE: 2003-10-21 
PRIOR APPLICATION NUMBER: 09/386,962 
PRIOR FILING DATE: 1999-08-31 
PRIOR APPLICATION NUMBER: 60/098,443 
PRIOR FILING DATE: 1998-08-31 
PRIOR APPLICATION NUMBER: 60/117,119 
PRIOR FILING DATE: 1999-01-25 
NUMBER OF SEQ ID NOS : 39 
SOFTWARE: PatentIn version 3.1
SEQ ID NO 4 
LENGTH: 1742 
TYPE: PRT 

ORGANISM: Staphylococcus epidermidis 
US-10-690-184-4 

Query Match 8.5%; Score 234.5; DB 4; Length 1742; 

Best Local Similarity 21.0%; Pred. No. 5.2e-06; 

Matches 127; Conservative 88; Mismatches 240; Indels 149; Gaps 24; 

Qy 48 FITYPFDQNRFVQYKATSLEKQHKHDLLTEPDLGVTIDLINPDTYRIDP NVLLDP 102 

: : I I I : I I : : I I I : : I I I I I : I 

Db 716 YVTLKDSNNRELQRVTTDQSGHYQFDNLQNGT--YTVEFAIPDNYTPSPANNSTNDAIDS 773

Qy 103 ADEKLLEEEIQAPTSSKRSQQHAKV VPWMRKTEYISTEFNRYGISNEKPEVKIG 156 

| : : : : : : | I : | : : | : M : : : I 

Db 774 DGERDGTRKVWAKGTINNADNMTVDTGFYLTPKYNVGDYVWEDTNKDGIQDDNEKGISG 833

Qy 157 VSV KQQFT — EEEIY KDRDSQ 175 

|| I : I I I I s s I I 

Db 834 VKVTLKNKNGDTIGTTTTDSNGKYEFTGLENGDYTIEFETPEGYTPTKQNSGSDEGKDSN 893 

Qy 176 ITAIEKTFEDA-QKSISQHYSKPRVTPVEVMPVFPDFKMWINPCAQVIFDSDP 227 

I I : I I I : I : I I I : : I : I I 

Db 894 GT KTTVT VKD ADN KTIDSGFYKPTYN LGDY-VWEDTNKDGIQDDSEKGI SGVK 945 

Qy 228 -APKDTSGAA ALEMMSQAMIRGMMDEEGNQFVAYFLPVEETLKKRKRDQEEEMDY- 281 

| | : I I : : I : I I : I : I I I I : • I 

Db 946 VTLKDKNGNAI GTTTTDASGHYQFKGL — ENGSYTVEFETPSGYTPTKANSGQDITVDSN 1003 



Qy  282 APDDVYD YKIAR EYNWNVKNK ASKGY 307

I : I II: : I I II II 
Db 1004 GITTTGIINGADNLTIDSGFYKTPKYSVGDYVWEDTNKDGIQDDNEKGISGVKVTLKDEK 1063

Qy  308 EENYFFIFREGD-GVYYNELETRVRLSKRRAKAGVQSGTNALLVVKHRDMN 357

: I I : I III I : : : I : I : : 

Db 1064 GNIISTTTTDENGKYQFDNLDSGNYIIHFEKPEGMTQTTANSG NDD 1109

Qy  358 EKELEAQEARKA QLENHEPEEEEEEEMETEEKEAGGSDEEQEKGSSSEKEGSE 410

||: : :: I ::| ::::::: II : : I :: : 

Db 1110 EKDADGEDVRVTITDHDDFSIDNGYFDDDSDSDSDADSDSDSDSDSDADSDSDADSDSDA 1169

Qy  411 DEHSGSESERE-EGDRDEASDK-SGSGEDESSEDEARAARDKEEIFGSDADSEDDADSDD 468
I | | : | : : : | | | I II I I : : : : I : I I : I I : I : I I I 
Db 1170 DSDSDSDSDSDADSDSDSDSDSDSDSDSDADSDSDSDSDSDADSDSDSDSDSDSDSDSDS 1229

Qy  469 EDRGQAQGGSDNDSDSGSNGGGQRSRSHSRSASPFPSGSEHSAQEDGSEAAASDS-SEAD 527

: : 11:11111: I I : I I I h « I : I I I I : I I 

Db 1230 DSDSDSDSDSDSDSDSDSDSD SDSDADSDSDADSDSDSDSDSDADSDSDSDSDSDAD 1286

Qy  528 SDSD 531
I I I I 

Db 1287 SDSD 1290



Search completed: April 25, 2006, 09:26:08 
Job time : 166 secs



GenCore version 5.1.7 
Copyright (c) 1993 - 2006 Biocceleration Ltd. 



OM protein - protein search, using sw model 
Run on: April 25, 2006, 09:23:34 



Search time 26 Seconds 

(without alignments) 

898.675 Million cell updates/sec 



Title:          US-10-721-553-2
Perfect score:  2764
Sequence:       1 MAPTIQTQAQREDGHRPNSH..........QEDGSEAAASDSSEADSDSD 531

Scoring table:  BLOSUM62
                Gapop 10.0 , Gapext 0.5

Searched:       225428 seqs, 44002918 residues

Total number of hits satisfying chosen parameters:     225428

Minimum DB seq length: 0
Maximum DB seq length: 2000000000

Post-processing: Minimum Match 0%
                 Maximum Match 100%
                 Listing first 45 summaries

Database :       Published_Applications_AA_New:*
                 1: /SIDS5/ptodata/l/pubpaa/US08_NEW_PUB.pep:*
                 2: /SIDS5/ptodata/l/pubpaa/US06_NEW_PUB.pep:*
                 3: /SIDS5/ptodata/l/pubpaa/US07_NEW_PUB.pep:*
                 4: /SIDS5/ptodata/l/pubpaa/PCT_NEW_PUB.pep:*
                 5: /SIDS5/ptodata/l/pubpaa/US09_NEW_PUB.pep:*
                 6: /SIDS5/ptodata/l/pubpaa/US10_NEW_PUB.pep:*
                 7: /SIDS5/ptodata/l/pubpaa/US11_NEW_PUB.pep:*
                 8: /SIDS5/ptodata/l/pubpaa/US60_NEW_PUB.pep:*

Pred. No. is the number of results predicted by chance to have a 
score greater than or equal to the score of the result being printed, 
and is derived by analysis of the total score distribution. 
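
The percentage fields printed with each result appear to follow directly from the other numbers in the listing. The sketch below is an inference from the figures shown here, not a documented GenCore formula: it assumes that "Query Match" is the hit score divided by the perfect score, and that "Best Local Similarity" is the number of matches divided by the total number of alignment columns (matches, conservative substitutions, mismatches and indels). The function names are hypothetical.

```python
def query_match_percent(score, perfect_score):
    """Assumed relation: Query Match = Score / Perfect score, in percent."""
    return round(100.0 * score / perfect_score, 1)


def best_local_similarity_percent(matches, conservative, mismatches, indels):
    """Assumed relation: Matches divided by all alignment columns, in percent."""
    columns = matches + conservative + mismatches + indels
    return round(100.0 * matches / columns, 1)


if __name__ == "__main__":
    # Values taken from records in this listing (perfect score 2764).
    print(query_match_percent(226, 2764))
    print(best_local_similarity_percent(125, 75, 208, 146))
```

Under these assumptions the figures for the Staphylococcus aureus hits with score 226 come out as the 8.2% and 22.6% printed in their records, which is a quick way to spot digits damaged in transcription.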
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5. 


6 


735 


7 


us- 


11- 


096- 


568A-29645 


Sequence 


29645/ A 


39 


154 .5 
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7 
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096- 
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5. 
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7 
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11- 


096- 


568A-31567 
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43 
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5. 


6 
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7 


us- 


11- 


024- 
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44 
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5. 


5 
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7 


us- 


11- 


087- 
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45 
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5. 


5 
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7 
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11- 


087- 
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ALIGNMENTS 



RESULT 1 

US-10-485-517-351 

; Sequence 351, Application US/10485517

; Publication No. US20050256299A1 

; GENERAL INFORMATION:

; APPLICANT: University of Sheffield 

; APPLICANT: Biosynexus Incorporated 

; APPLICANT: Foster, Simon 

; APPLICANT: Mond, James

; TITLE OF INVENTION: Antigenic Polypeptides 

; FILE REFERENCE: P100629WO 

; CURRENT APPLICATION NUMBER: US/10/485,517 

; CURRENT FILING DATE: 2004-02-02 

; PRIOR APPLICATION NUMBER: GB 0118825.9 

; PRIOR FILING DATE: 2001-08-02 

; PRIOR APPLICATION NUMBER: GB 0200349.9 



; PRIOR FILING DATE: 2002-01-09 
; NUMBER OF SEQ ID NOS : 424 

; SOFTWARE: PatentIn version 3.1
; SEQ ID NO 351 
; LENGTH: 743 
; TYPE: PRT 

; ORGANISM: Staphylococcus aureus 
US-10-485-517-351 

Query Match 8.2%; Score 226; DB 6; Length 743; 

Best Local Similarity 22.6%; Pred. No. 1.1e-06;

Matches 125; Conservative 75; Mismatches 208; Indels 146; Gaps 23; 

Qy 33 VKYCNSLPDIPF-DPKFITYPFDQNRFVQYKATSLEKQHKHDLLTEPDLGVTIDLIN 88

I I I I : I I I I I I I I : I : I I : I I : I 

Db 137 VDYSNSNNTMPIADIK STNGDWAKAT YDILTKTYTFVFTDYVNNKE 183 

Qy 89 PDTYRI DPNVLLDPADEKLLEE EIQAPTSSKRS 121 

I : I I : : I I I : I I : I 

Db 184 NINGQFSLPLFTDRAKAPKSGTYDANINI — ADEMFNNKITYNYSSPIAGIDKPNGANIS 241 

Qy 122 QQHAKVVPWMRKTEYISTEFNRYGISNEKPEVKIGVSVKQQFTEEEIYKDRDSQITAIE- 180

II : I I I III I : :::| :: :::| : 

Db 242 SQIIGVDTASGQNTYKQTVF VNPKQRVLGNTWVYI KGYQDKI - EES SGKVSATDT 295 

Qy 181 --KTFE--DAQKSISQHYSKPRVTPV-EVMPVFPDFKMWINPCAQVIFDSDPAPKDTSGA 235

: I I I I : | : | : : | | I : : : | | | 
Db 296 KLRI FEVNDTSKLSDS YYADPNDSNLKEVTDQFKNRI YYEHPNVASI KFGD 346 

Qy 236 AALEMMSQAMIRGMMDEEGNQFVAYFLPVEETLKKRKRDQEEEMDYAPDDVYDYKIAREY 295

: : : | | | | : | : : | III: 
Db 347 — I TKT YWLVEGH YDNTG KNLKTQVIQENVDPVTNRDYSI F 386 

Qy 296 NWNVKNKASKGYEENYFFIFREGDGVYYNELETRVRLSKRRAKAGVQSGTNALLVVKHRD 355

I I : I : I I I I : I 
Db 387 GWNNEN WRYG GGSADGDSA 406 

Qy 356 MNEKE LEAQEARKAQLE NHEPEEEEEEEMETEEKEAGGSDEEQEK 400 

:| |: :: : : : | : I I I : : ::: | | | : 

Db 407 VNPKDPTPGPPVDPEPSPDPEPEPTPDPEPSPDPEPEPSPDPDPDSDSDSDSGSDSDSGS 466 

Qy 401 GSSSEKEGSEDEHSGSESERE-EGDRDEASD-KSGSGEDESSEDEARAARDKEEIFGSDA 458 

III: I I I : I : : I I I I I : I I I I : : : : I : I I : 
Db 467 DSDSESDSDSDSDSDSDSDSDSESDSDSESDSESDSDSDSDSDSDSDSDSDSDSDSDSDS 526 

Qy 459 DSEDDADSDDEDRGQAQGGSDNDSDSGSNGGGQRSRSHSRSASPFPSGSEHSAQEDGSEA 518 

I I : I : I I I : : I I : I I I I I : I I I I I I I : : I 

Db 527 DSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSD-SDSDSDSDSDSDSDSDSDSDSDSDSD 585 

Qy 519 AASDS-SEADSDSD 531 

: III I :: I I I I I 
Db 586 SDSDSDSDSDSDSD 599 



RESULT 2 

US-10-485-517-200 

; Sequence 200, Application US/10485517 



; Publication No. US20050256299A1 

; GENERAL INFORMATION: 

; APPLICANT: University of Sheffield 

; APPLICANT: Biosynexus Incorporated 

; APPLICANT: Foster, Simon

; APPLICANT: Mond, James 

; TITLE OF INVENTION: Antigenic Polypeptides 
; FILE REFERENCE: P100629WO
; CURRENT APPLICATION NUMBER: US/10/485,517

; CURRENT FILING DATE: 2004-02-02 

; PRIOR APPLICATION NUMBER: GB 0118825.9 

; PRIOR FILING DATE: 2001-08-02 

; PRIOR APPLICATION NUMBER: GB 0200349.9 

; PRIOR FILING DATE: 2002-01-09 

; NUMBER OF SEQ ID NOS : 424 

; SOFTWARE: PatentIn version 3.1
; SEQ ID NO 200
; LENGTH: 877
; TYPE: PRT
; ORGANISM: Staphylococcus aureus 
US-10-485-517-200 

Query Match 8.2%; Score 226; DB 6; Length 877; 

Best Local Similarity 22.6%; Pred. No. 1.4e-06; 

Matches 125; Conservative 75; Mismatches 208; Indels 146; Gaps 23; 

Qy 33 VKYCNSLPDIPF-DPKFITYPFDQNRFVQYKATSLEKQHKHDLLTEPDLGVTIDLIN 88 

I I I I : I I I I I I I I : I : I I : I I : I 

Db 271 VDYSNSNNTMPIADIK STNGDWAKAT YD I LT KT YT FVFT D YVNNKE 317 

Qy 89 PDTYRIDPNVLLDPADEKLLEE EIQAPTSSKRS 121 

I : I I : : I I I : I I : I 

Db 318 NINGQFSLPLFTDRAKAPKSGTYDANINI— ADEMFNNKITYNYSSPIAGIDKPNGANIS 375 

Qy 122 QQHAKVVPWMRKTEYISTEFNRYGISNEKPEVKIGVSVKQQFTEEEIYKDRDSQITAIE- 180

II : I I I III I : :::| :: . :::| : 

Db 376 SQIIGVDTASGQNTYKQTVF VNPKQRVLGNTWVYIKGYQDKI-EESSGKVSATDT 429 

Qy 181 --KTFE--DAQKSISQHYSKPRVTPV-EVMPVFPDFKMWINPCAQVIFDSDPAPKDTSGA 235

: I I I I : | : | : : | | I : : : | II 
Db 430 KLRIFEVNDTSKLSDSYYADPNDSNLKEVTDQFKNRIYYEHPNVASIKFGD 480 

Qy 236 AALEMMSQAMIRGMMDEEGNQFVAYFLPVEETLKKRKRDQEEEMDYAPDDVYDYKIAREY 295 

: : : I | | | : | : : | III: 
Db 481 — ITKTYWLVEGHYDNTG KNLKTQVIQENVDPVTNRDYSI F 520 

Qy 296 NWNVKNKASKGYEENYFFIFREGDGVYYNELETRVRLSKRRAKAGVQSGTNALLVVKHRD 355 

I I : I : II I I : I 
Db 521 GWNNEN WRYG GGSADGDSA 540 

Qy 356 MNEKE LEAQEARKAQLE NHEPEEEEEEEMETEEKEAGGSDEEQEK 400 

: I I : :: : : : I : I I I : : ::: Ml: 

Db 541 VNPKDPTPGPPVDPEPSPDPEPEPTPDPEPSPDPEPEPSPDPDPDSDSDSDSGSDSDSGS 600 

Qy 401 GSSSEKEGSEDEHSGSESERE-EGDRDEASD-KSGSGEDESSEDEARAARDKEEIFGSDA 458 

III: I I I : I : : I I I I I : I I I I : : : : I : I I : 
Db 601 DSDSESDSDSDSDSDSDSDSDSESDSDSESDSESDSDSDSDSDSDSDSDSDSDSDSDSDS 660 



Qy 459 DSEDDADSDDEDRGQAQGGSDNDSDSGSNGGGQRSRSHSRSASPFPSGSEHSAQEDGSEA 518 

I I : I : I I I : : I I : I I I I I : I I I I I I I : : I 

Db 661 DSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSD-SDSDSDSDSDSDSDSDSDSDSDSDSD 719 

Qy 519 AASDS-SEADSDSD 531 

: III I::lllll 
Db 720 SDSDSDSDSDSDSD 733 



RESULT 3 

US-10-799-749-78 

; Sequence 78, Application US/10799749 

; Publication No. US20060020391A1 

; GENERAL INFORMATION: 

; APPLICANT: Kreiswirth, Barry N 

; APPLICANT: Nadich, Steven M 

; TITLE OF INVENTION: System and Method for Tracking and Controlling Infections 
; FILE REFERENCE: 19124.0002 

; CURRENT APPLICATION NUMBER: US/10/799,749 

; CURRENT FILING DATE: 2004-03-15 

; NUMBER OF SEQ ID NOS : 80 

; SOFTWARE: Patentln version 3.1 

; SEQ ID NO 78 

LENGTH: 265 

TYPE: PRT 
; ORGANISM: Staphylococcus aureus 
US-10-799-749-78 

Query Match 8.1%; Score 224; DB 6; Length 265; 

Best Local Similarity 34.4%; Pred. No. 4.3e-07; 

Matches 55; Conservative 29; Mismatches 72; Indels 4; Gaps 4; 



Qy 375 EPEEEEEEEMETEEKEAGGSDEEQEKGSSSEKEGSEDEHSGSESERE-EGDRDEASDK-S 432 

III : : : : : III: III: I I I : I : : I I I II I 

Db 22 EPEPSPDPDPDSDSDSDSGSDSDSGSDSDSESDSDSDSDSDSDSDSDSESDSDSESDSDS 81 

Qy 433 GSGEDESSEDEARAARDKEEIFGSDADSEDDADSDDEDRGQAQGGSDNDSDSGSNGGGQR 492 

I I I : : : : I : I I : I I : :: I I I I : : I I : I I I I I : 
Db 82 DSDSDSDSDSDSESDSDSDSDSDSDSDSDSESDSDSESDSESDSDSDSDSDSDSDSDSD- 140 

Qy 493 SRSHSRSASPFPSGSEHSAQEDGSEAAASDS-SEADSDSD 531 

I I I I I II: :: I : I I I |::| I I I I 
Db 141 SDSDSDSDSDSDSDSDSDSESDSDSDSDSDSDSDSDSDSD 180 



RESULT 4 

US-11-185-924-18 

; Sequence 18, Application US/11185924 

; Publication No. US20060078945A1 

; GENERAL INFORMATION: 

; APPLICANT: Fisher et al., Larry

; TITLE OF INVENTION: Complex Formed by Small Integrin-Binding Ligand, 

; TITLE OF INVENTION: N-Linked Glycoproteins (SIBLINGS) and Factor H 

; FILE REFERENCE: 4239-61301-02 

; CURRENT APPLICATION NUMBER: US/11/185,924 

; CURRENT FILING DATE: 2005-07-19 



; PRIOR APPLICATION NUMBER: 09/958,617 

; PRIOR FILING DATE: 2002-01-18 

; PRIOR APPLICATION NUMBER: PCT/US00/09349 

; PRIOR FILING DATE: 2000-04-09 

; PRIOR APPLICATION NUMBER: 60/128,468 

; PRIOR FILING DATE: 1999-04-09 

; NUMBER OF SEQ ID NOS : 18 

; SOFTWARE: Patentln Ver. 2.1 

; SEQ ID NO 18 

LENGTH: 1253 

TYPE: PRT 
; ORGANISM: Homo sapiens 
US-11-185-924-18 

Query Match 8.0%; Score 220.5; DB 7; Length 1253; 

Best Local Similarity 23.1%; Pred. No. 4.7e-06; 

Matches 93; Conservative 57; Mismatches 160; Indels 93; Gaps 16; 

Qy 171 DRDSQITAIEKTFEDAQKSISQHYSKPRVTPVEVMPVFPDFKMWINPCAQVIFDSDPAPK 230

I I I I llhll : I : I I I I 

Db 290 DHDSSI GQNSDSKEYYDPEGKE DPHNEV — DGDKTSK 324 

Qy 231 DTSGAAALEMMSQAMIRGMMDEEGNQFVAYFLPVEETLK KR KRDQEEE 278 

: I | : :: | : | : I : I I II I : 

Db 325 SEENSA GIPEDNGSQ RIEDTQKLNHRESKRVENRITKESETHA 367 

Qy 279 MDYAPDDVYDYKIAREYNWNVKNKASKGYE ENYFFIFREGDGVYYNE LETRV 330

: : I : I I I : : I I I : I : I : I : I 

Db 368 VGKSQDKGIEIKGPSSGNRNITKEVGKGNEGKEDKGQHGMILGKGNVKTQGEWNIEGPG 427 

Qy 331 RLSKRRAKAGVQSGTNALLVVKHRDMNEKELEAQEARKAQLENHEPEEEEE E 382

: I: II II: II :: : :: :| :| 

Db 428 QKSEPGNKVG-HSNTGS DSNSDGYDSYDFDDKSMQGDDPNSSDESNGNDDANS 479 

Qy 383 EMETEEKEAG GSDEEQEKGSSSEKEGSEDEHSGSESEREEGDR DEASDK 431 

I : I I I I : : I : I : : I : I I : I I I : I :: :|| 

Db 480 ESDNNSSSRGDASYNSDESKDNGNGSDSKGAEDDDSDSTSDTNNSDSNGNGNNGNDDNDK 539 

Qy 432 SGSGE DESSEDEARAARDKEEIFGSDADSEDDADSDDEDRGQAQGGSDNDSDSGSNG 488 

III: II I : :: thill III : : I II I I : 

Db 540 SDSGKGKSDSSDSDSSDSSNSSDSSDSSDSDSSDSNSSSDSDSSDSDSSDSSDSDS-SDS 598 

Qy 489 GGQRSRSHSRSASPFPSGSEHSAQEDGSEAAASDSSEADSDSD 531 

II : I I : I : I : I I I I : : I I I I 

Db 599 SNSSDSSDSSDSSDSSDSSDSSDSKSDSSKSESDSSDSDSKSD 641 



RESULT 5 

US-10-793-626-468 

; Sequence 468, Application US/10793626 

; Publication No. US20050255478A1 

; GENERAL INFORMATION: 

; APPLICANT: KIMMERLY, WILLIAM JOHN 

; TITLE OF INVENTION: STAPHYLOCOCCUS EPIDERMIDIS NUCLEIC ACIDS AND PROTEINS 
; FILE REFERENCE: PU3480US 

; CURRENT APPLICATION NUMBER: US/10/793,626 
; CURRENT FILING DATE: 2004-03-04 



; PRIOR APPLICATION NUMBER: 60/164,258
; PRIOR FILING DATE: 1999-11-09
; NUMBER OF SEQ ID NOS: 4472
; SOFTWARE: PatentIn Ver. 2.1
; SEQ ID NO 468
; LENGTH: 287
; TYPE: PRT

; ORGANISM: Artificial Sequence
; FEATURE:

; OTHER INFORMATION: Description of Artificial Sequence: synthetic
; OTHER INFORMATION: amino acid sequence
US-10-793-626-468 

Query Match 7.7%; Score 212.5; DB 6; Length 287; 

Best Local Similarity 29.2%; Pred. No. 2.4e-06; 

Matches 54; Conservative 38; Mismatches 78; Indels 15; Gaps 5; 

Qy 355 DMNEKELEAQEARKA QLENHEPEEEEEEEMETEEKEAGGSDEEQEKGSSSEKE 407 

| :|:: : :| ::| ::| : : ::: || : : | |: : 

Db 91 DDDEQDADGEEVHVTITDHDDFSIDNGYYDDESDSDSDSDSDSDSDSDSDSDSDSDSDSD 150 

Qy 408 GSEDEHSGSESEREEGDRDEASDKSGSGEDESSEDEARAARDKEEIFGSDADSEDDADSD 467 

I I I : I : : I I I I I I I I : : : : I : I I : I I : I : I I I 
Db 151 SDSDSDSDSDSD-SDSDSDSDSD-SDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSD 208 

Qy 468 DEDRGQAQGGSDNDSDSGSNGGGQRSRSHSRSASPFPSGSEHSAQEDGSEAAASDS-SEA 526 

: : ||:|||||: I I I I I I I : : I : III I : : 

Db 209 SDSDSDSDSDSDSDSDSDSD SDSDSDSDSDSDSDSDSDSDSDSDSDSGSDSDSDS 263 

Qy 527 DSDSD 531 

I I I I I 

Db 264 DSDSD 268 



RESULT 6 

US-10-485-517-240 

; Sequence 240, Application US/10485517 
; Publication No. US20050256299A1 
; GENERAL INFORMATION: 

; APPLICANT: University of Sheffield 
; APPLICANT: Biosynexus Incorporated 
; APPLICANT: Foster, Simon 
; APPLICANT: Mond, James 

; TITLE OF INVENTION: Antigenic Polypeptides
; FILE REFERENCE: P100629WO 

; CURRENT APPLICATION NUMBER: US/10/485,517 

; CURRENT FILING DATE: 2004-02-02 

; PRIOR APPLICATION NUMBER: GB 0118825.9 

; PRIOR FILING DATE: 2001-08-02 

; PRIOR APPLICATION NUMBER: GB 0200349.9 

; PRIOR FILING DATE: 2002-01-09 

; NUMBER OF SEQ ID NOS: 424 

; SOFTWARE: PatentIn version 3.1
; SEQ ID NO 240

; LENGTH: 280

; TYPE: PRT
; ORGANISM: Staphylococcus aureus 



US-10-485-517-240 



Query Match 7.5%; Score 206; DB 6; Length 280; 

Best Local Similarity 31.2%; Pred. No. 6e-06; 

Matches 50; Conservative 32; Mismatches 74; Indels 4; Gaps 4; 



Qy 375 EPEEEEEEEMETEEKEAGGSDEEQEKGSSSEKEGSEDEHSGSESERE-EGDRDEASDK-S 432 

:::::::: II : : I |: : I I |:|: : : I I II I 

Db 36 DSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDS 95 

Qy 433 GSGEDESSEDEARAARDKEEIFGSDADSEDDADSDDEDRGQAQGGSDNDSDSGSNGGGQR 492 

I I I : : : : I : I I : I I : I : I I I : : I I I I I I I I : 
Db 96 DSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDNDSDSDSDSDSD- 154 

Qy 493 SRSHSRSASPFPSGSEHSAQEDGSEAAASDS-SEADSDSD 531 

I I I I I II: : I : I I I I I I I I I 
Db 155 SDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSD 194 



RESULT 7 

US-10-793-626-788 

; Sequence 788, Application US/10793626 
; Publication No. US20050255478A1 
; GENERAL INFORMATION: 

; APPLICANT: KIMMERLY, WILLIAM JOHN 

; TITLE OF INVENTION: STAPHYLOCOCCUS EPIDERMIDIS NUCLEIC ACIDS AND PROTEINS 
; FILE REFERENCE: PU3480US 

; CURRENT APPLICATION NUMBER: US/10/793,626
; CURRENT FILING DATE: 2004-03-04 
; PRIOR APPLICATION NUMBER: 60/164,258 
; PRIOR FILING DATE: 1999-11-09 
; NUMBER OF SEQ ID NOS: 4472
; SOFTWARE: PatentIn Ver. 2.1
; SEQ ID NO 788
; LENGTH: 486
; TYPE: PRT

; ORGANISM: Artificial Sequence
; FEATURE:

; OTHER INFORMATION: Description of Artificial Sequence: synthetic 

; OTHER INFORMATION: amino acid sequence 

US-10-793-626-788 



Query Match 6.6%; Score 183; DB 6; Length 486; 

Best Local Similarity 32.2%; Pred. No. 0.00032; 

Matches 49; Conservative 29; Mismatches 66; Indels 8; Gaps 4; 

Qy 381 EEEMETEEKEAGGSDEEQEKGSSSEKEGSEDEHSGSESEREEGDRDEASDKSGSGEDESS 440 

: : : : I : I : : I I : : I I I : I : : I I I I I I I I 

Db 61 DNDLDTDIVSNSDSENDTYLDSDSDSDSDLDSDSDSDSD-SDSDSDSDSD-SDSDSDSDS 118 



Qy 441 EDEARAARDKEEIFGSDADSEDDADSDDEDRGQAQGGSDNDSDSGSNGGGQRSRSHSRSA 500 

: : : : I : I I : M : I : I I I : : I I : I I I I I : I I I I 

Db 119 DSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSD SDSDSDSD 173 

Qy 501 SPFPSGSEHSAQEDGSEAAASDS-SEADSDSD 531 

I II: : I : I I I I :: II I I I 

Db 174 SDSDSDSDSDSDSDSDSDSDSDSDSDSDSDSD 205 



RESULT 8 

US-11-087-099-9112 

; Sequence 9112, Application US/11087099 

; Publication No. US20060041961A1 

; GENERAL INFORMATION: 

; APPLICANT: Abad, Mark S. et al. 

; TITLE OF INVENTION: Genes and Uses for Plant Improvement 

; FILE REFERENCE: 38-21 (53450) B EP

; CURRENT APPLICATION NUMBER: US/11/087,099

; CURRENT FILING DATE: 2005-03-22

; NUMBER OF SEQ ID NOS: 12464

; SEQ ID NO 9112

; LENGTH: 1244

; TYPE: PRT

; ORGANISM: Magnetococcus sp. MC-1 
US-11-087-099-9112 

Query Match 6.6%; Score 182; DB 7; Length 1244; 

Best Local Similarity 22.9%; Pred. No. 0.0012; 

Matches 80; Conservative 49; Mismatches 150; Indels 70; Gaps 9; 

Qy 212 KMWINPCAQVIFDSDPAPKDTSGAAA LEMMSQAMI RGMMDEEGNQFVAYFLPVEE- 266 

I : : I : t : I I I . I : I I I I : : I I : I I 

Db 60 KLKCSQCHHIFFQAPPEPKSAQPPASEQPGLEDESTAQDNDTESRDYAEFAFEESPLEEV 119 

Qy 267 TLK KRKRDQEEEMDYAPDDVYDYKIAREYNWNVKNKASKGYEENY 311 

II : II: : I : I I 
Db 120 DLDEIEKLTAQATLDMALEATRDKRQEPSFDEDTQVD 156

Qy 312 FFIFREGDGVYYNELETRVRLSKRRAKAGVQSGTNALLVVKHRDMNEKELEAQEARKAQL 371

II II : I : : I : I : I I I I : I : I : 

Db 157 EDAVIEPSLEEVDVDQMIQAATALPTEPEAASEAEEELEAEEELEAEEEPEAEE 210 

Qy 372 E NHEPEEEE EEEMETEEKEAGGSDEEQEKGSSSEKEGSEDEHSGSESEREEGD 424 

I I I I I I llhlll: : | | : | : | : | : I : I I I : 

Db 211 EPEAEEEPEAEEEPEAEEELEAEEEPEAEEEPEAEEESEAEEESEAEEEPEAEEESEVEE 270 

Qy 425 RDEASDKSGSGEDESSEDEA--RAARDKEEIFGSDADSEDDADSDDEDRGQAQGGSDNDS 482

I : : I : III II : II : : I : : : I. : I : I : : 
Db 271 APEVEEELELEEEAEEEPEAEEEAAPEAEEEAAPEVEEEPEVEEELELEEEAEEESEAE- 329 

Qy 483 DSGSNGGGQRSRSHSRSASPFPSGSEHSAQEDGSEAAASDSSEADSDSD 531 

: I : :| I I :| I III I I ::: 

Db 330 EESEAEEEAA PEAEEEAAPEAEEEAAPEAEEEAAPEAE 367 



RESULT 9 

US-10-821-234-905 

; Sequence 905, Application US/10821234

; Publication No. US20050255114A1 

; GENERAL INFORMATION: 

; APPLICANT: Labat, Ivan 

; APPLICANT: Stache-Crain, Birgit 

; APPLICANT: Andarmani, Susan 

; APPLICANT: Tang, Y. Tom 



; TITLE OF INVENTION: Methods for Diagnosis and Treatment of Preeclampsia 
; FILE REFERENCE: 821A 

; CURRENT APPLICATION NUMBER: US/10/821,234

; CURRENT FILING DATE: 2004-04-07 

; PRIOR APPLICATION NUMBER: US 60/462,047 

; PRIOR FILING DATE: 2003-04-07 

; NUMBER OF SEQ ID NOS: 1704

; SOFTWARE: pt_SEQ_genes Version 1.0 

; SEQ ID NO 905 

; LENGTH: 697 

; TYPE: PRT 

; ORGANISM: Homo sapiens 
US-10-821-234-905 

Query Match 6.5%; Score 181; DB 6; Length 697; 

Best Local Similarity 21.9%; Pred. No. 0.00066; 

Matches 110; Conservative 68; Mismatches 184; Indels 140; Gaps 21; 

Qy 108 LEEEIQAPTSSKRSQQHAKWPWMRKTEYISTEFNRYGISNEKPEVKIGVSVKQQF 163 

I I I : I : : I I I : I : : I I : I I : : : 

Db 83 LEGEARTPLAI PHT PWGRRPEEEAEDSGGPGEDRETLGLKTSSSLPEAWGLLD 135 

Qy 164 TEEEIYKDRDSQITAIEK TFEDAQKSISQHYSKPRVTPVEVMPVFPDFKMWINPCA 219 

:: :| :|:: I:: : III:: I: I : 
Db 136 DDDGMYGEREA— TSVPRGQGSQFADGQRA PLSPSLLI 171 

Qy 220 QVIFDSDPAPKDTSGAAALEMMSQAMIRGMMDEEGNQFVAY FLPV EET 267 

: : I I I : : I J : : I I I : I I II 

Db 172 RTLQGSDKNPGE EKAEEEGVAEEEGVNKFSYPPSHRECCPAVEEEDDEEA 221 

Qy 268 LKKRKRDQE EEMD YAP DD — VYD YKI ARE YNWNVKN KAS — K 305 



Db 222 VKKEAHRTSTSALSPGSKPSTWVSCPGEEENQATEDKRTERSKGARKTSVSPRSSGSDPR 281 

Qy 306 GYEENYFFIFREGDGVYYNELETRVRLSKRRAKAGVQSGTNALLV VKHRDMNEK 359 

: I : I I : I : I I I I I I : I I 

Db 282 SWE YRSGEASEEKEEKAHKETGKGEAAPGPQSSAPAQRPQLKSWWCQPSDEEEG 335 

Qy 360 ELEAQEARKAQLENH EPEEEEEEEMETEEKEAGGSDEEQE 399 

I : : I I : I I I I I I : I I I : : : I : I I I 

Db 336 EVTCALGAAEKDGEAECPPCIPPPSAFLKAWVYWPGEDTEEEEDEEEDEDSDSGSDEEEGE 395 

Qy 400 KGSSSEKEG SEDEHSGSESEREEGDRDEASDKSGSGEDESSEDEARAARDKEEI 453 

: I I I I : : I I I I I I I : I I I I I II I : 
Db 396 AEASSSTPATGVFLKSWVYQPGEDTEEEE DEDSD-TGSAEDE-REAETSASTPPASA 450 

Qy 454 F GSDADSEDDADSDDEDRGQAQGGSDNDSDSGSNGGGQRSRSHSRSASPFPS 505 

I I I : I : I I I I I : : : : : | : I : I I I 

Db 451 FLKAWVYRPGEDTEEEEDEDVDSEDKEDDSEAALGEAESDPHPSHPDQRAHFRGWGYRP- 509

Qy 506 GSEHSAQEDGSEAAASDSSEAD 527 

: I I I I I II: 
Db 510 GKETEEEEAAEDWGEAE 526 



RESULT 10 
US-10-469-469-250 



; Sequence 250, Application US/10469469
; Publication No. US20060079493A1
; GENERAL INFORMATION:
; APPLICANT: FRITZ, LAWRENCE C.
; APPLICANT: BURROWS, FRANCIS J.

; TITLE OF INVENTION: METHODS FOR TREATING GENETICALLY-DEFINED PROLIFERATIVE
; TITLE OF INVENTION: DISORDERS WITH HSP90 INHIBITORS
; FILE REFERENCE: CON-0010-USN
; CURRENT APPLICATION NUMBER: US/10/469,469
; CURRENT FILING DATE: 2003-08-27
; PRIOR APPLICATION NUMBER: PCT/US02/06518
; PRIOR FILING DATE: 2002-03-01
; PRIOR APPLICATION NUMBER: 60/272,751
; PRIOR FILING DATE: 2001-03-01
; NUMBER OF SEQ ID NOS: 330
; SOFTWARE: PatentIn Ver. 2.1
; SEQ ID NO 250
; LENGTH: 2004
; TYPE: PRT

; ORGANISM: Homo sapiens
US-10-469-469-250 



Query Match 6.4%; Score 178; DB 6; Length 2004;

Best Local Similarity 21.1%; Pred. No. 0.0037;

Matches 105; Conservative 65; Mismatches 163; Indels 164; Gaps 20;



Qy 109 EEEIQAPTSS KRSQQHAKWPWMRKTEYI STEFNRY 144

III ::| II II :: II II II

Db 991 EEEPESPRSSSPPILTKPTLKRKKPFLHRRRRVRKRKHHNSSW TETIS 1039



Qy 145 GISNEKPEVKIGVSVKQQFTEEEIYKDRDSQ--ITAIEKTFEDAQKSISQHYSKPRVTPV 202

I II : I : : I I I : : Mill 
Db 1040 ETTEVL DEPFEDSDSERPMPRLEPTFE 1066 



Qy 203 EVMPVFPDFKMWINPCAQVIFDSDPAPKDTSGAAALEMMSQAMIRGMMDEEGNQFVAYFL 262

I : :| : |: : I : :: I 
Db 1067 IDEEEEEEDEN ELFPREYFRRLSSQD VL 1094 

Qy 263 PVEETLKKRKRDQEEEMDYAPDDVYDYKIAREYNW NVKN KASKGY 307 

: : I : : : I : I I I II I I : : : I I I I I I : 

Db 1095 RCQSSSKRKSKDEEE--DEESDDADDTPILKPVSLLRKRDVKNSPLEPDTSTPLKKKKGW 1152

Qy 308 EENYFFIFREGDGVYYNELETRVRLSKRRAKAGVQSGT-NALLVVKHRDMNEKELEA 363

: : : : : : | : | | : | : : : : | | : 

Db 1153 PKG KSRKPIHWKKRPGRKPGFKLSREIMPVSTQACVIEPIVSIPKAGRKPKIQES 1207 

Qy 364 QEARKAQLENHEPEE-EEEEEMETEEKEAGGSDEEQEKGSSSEKEGSEDEHSGSESEREE 422 

: I : : : I I I : I I I I I : I : I I : I I I I : I : I : I 

Db 1208 EETVEPKEDMPLPEERKEEEEMQAEAEEAEEGEEEDAASSEVPAASPADSSNSPETETKE 1267 

Qy 423 GDRDEASDKSG-SGEDESSEDEARAARDKEEIFGSDADSE DDADSDDEDRGQAQGG 477 

: : I : I II I I : I : : I I I : I I I I : I I I I I : 

Db 1268 PEVEEEEEKPRVSEEQRQSEEEQQELEEPEPEEEEDAAAETAQNDDHDADDEDDGHLEST 1327 

Qy 478 SDND SDSGSNGGGQRSRS HSRSASPFPSGSEHSA 511 

: : | : | : | | | I I I 

Db 1328 KKKELEEQPTREDVKEEPGVQESFLDANMQKSREKIKDKEETELDSEEEQPSHDTSWSE 1387 



Qy 512 QEDGSEAAASDSSEADS 528 

I III I III 

Db 1388 QMAGSE DDHEEDS 1400 



RESULT 11 

US-11-098-686-10232 

; Sequence 10232, Application US/11098686 
; Publication No. US20060024696A1 
; GENERAL INFORMATION: 

; APPLICANT: Kapur, Vivek and Gebhart, Connie J. 

; TITLE OF INVENTION: NUCLEIC ACID AND POLYPEPTIDE SEQUENCES 

; TITLE OF INVENTION: FROM LAWSONIA INTRACELLULARIS AND METHODS OF USING

; FILE REFERENCE: 09531-128001 

; CURRENT APPLICATION NUMBER: US/11/098,686 

; CURRENT FILING DATE: 2005-04-04 

; PRIOR APPLICATION NUMBER: PCT/US03/31318 

; PRIOR FILING DATE: 2003-10-01 

; PRIOR APPLICATION NUMBER: US 60/416,395 

; PRIOR FILING DATE: 2002-10-04 

; NUMBER OF SEQ ID NOS : 11433 

; SOFTWARE: FastSEQ for Windows Version 4.0
; SEQ ID NO 10232

; LENGTH: 8746

; TYPE: PRT

; ORGANISM: Lawsonia intracellularis
US-11-098-686-10232 

Query Match 6.4%; Score 177; DB 7; Length 8746; 

Best Local Similarity 18.3%; Pred. No. 0.025; 

Matches 117; Conservative 94; Mismatches 219; Indels 210; Gaps 24; 

Qy 60 QYKAT — SLEKQHKHDLLTEPDLGVTIDLINPDTYRIDP NVLLDPADEKLLEEE 111 

I I : I : I : : : I III::: : III: II : I 

Db 7070 QYETTVQKIEQKYKEKKANRHILGCTLEELQEQEEKESKVAVGNFTVLLEKMREKQQKEL 7129 

Qy 112 IQAPTSSKRSQQHAKWPWMRKTEYISTEFNRYGISNEKPEVKIGVSVKQQFTEEEIYKD 171 

I I : I : I : I : I I : I I : I : I I : : 

Db 7130 QQFPSSDEDESD VDRRKKKKVT KTKAER NAQREQR KHHHTPHPYHPE 7176 

Qy 172 RDS QITAIEKTFEDAQKSISQHYSKPRVTPVEVMP 206 

: | | : : | : : : I I : : I 

Db 7177 ESTSHLVXDTKQIVTLTPSSSDQETPVQSKETATNETPIPSLPSTVKGLTLEEVTVTVLP 7236 

Qy 207 VFPDFKMWINPC AQVT FD S D PAP KDT S GAAAL EMMS QAMI RGMMD E 252 

I : II II I : I I I :: : : I I 

Db 7237 QPTQLSSLLSYFSSEPLQEQPCQVEEQQVSFESSQVQASTD EATIDPEVQLVLDE 7291 

Qy 253 EGNQ FVAY FL P VEET LKKRKRDQEEE — MD YAP DDVYD YKI ARE YNWNVKN KASK 305 

: : : : : : | : : : | | : | | I : : I I : II 

Db 7292 YSSKVAC LQQEMDKKLQEI EEKGTDS SAS S DT EWSWPKKDMPREI KTLK 7340 

Qy 306 GYEENYFFI FREG DGVYYNELETRVRLSKR 335 

I : II I : I : I I I 

Db 7341 GSDS EGKDQQEVPKIPSAGAHSLSSMAESEDVGAVSHIEKKKRKRHKKHKRSQ 7393 



Qy 336 RAKAGVQSGTNALLVVKHRDMNEKELEAQEARKAQLENHEPEEEEEEE 383

III : : I : : : : : I I I I I : I : I I : : I I I 
Db 7394 REKISRAKRALMAQYFAKVTLLGQECSEKVSEIQEQKKQKCEIEEKQRKEREEFFDQHQA 7453 

Qy 384 METEEKEAGG SDEEQEKGSSSEKEGSEDEHSGSESEREEGDRDEAS 429 

: : : : | | I : : : : : : : : I I : : I : I I : I 

Db 7454 ARNALLAKQQQELIGVAPEEVGPLQDKHRQEQQAQSRQWGLNLHSLIKQQRKE--RGLSS 7511 

Qy 430 DKSGSGEDESSEDEARAARDKEE IFGSDADSEDDADSDDEDRGQA 474 

I I I I II : I : I : : I : I : I I I I 

Db 7512 SISSSSSDEDIIDEGSQSDDQEDSKSLSSPISPPPSPPVSGADSQCIGGASSSDTDATMK 7571 

Qy 475 QGG SDNDSDSGSN GGGQRSRS-HSRSASPFP 504 

I I I :: I I I I I I I : I I 

Db 7572 S DGHKS PEVPVS S DKKEETGGNQS SKVTT YLLSVFTGKGGTAAGAQS S S SEHTGSKRQQP 7631 

Qy 505 SGSEHSAQE DGSEAAASDSSEAD 527 

III:::: : I : I I : : 

Db 7632 SGSDQTSKSSRQGPSTPFEGQGTSGVEGASGGAGDPGDGE 7671 



RESULT 12 

US-11-087-099-1159 

; Sequence 1159, Application US/11087099 

; Publication No. US20060041961A1 

; GENERAL INFORMATION:

; APPLICANT: Abad, Mark S. et al.

; TITLE OF INVENTION: Genes and Uses for Plant Improvement 

; FILE REFERENCE: 38-21 (53450) B EP 

; CURRENT APPLICATION NUMBER: US/11/087,099 

; CURRENT FILING DATE: 2005-03-22 

; NUMBER OF SEQ ID NOS: 12464

; SEQ ID NO 1159

; LENGTH: 499

; TYPE: PRT

; ORGANISM: Lycopersicon esculentum 
US-11-087-099-1159 

Query Match 6.3%; Score 173; DB 7; Length 499; 

Best Local Similarity 18.4%; Pred. No. 0.0014; 

Matches 91; Conservative 86; Mismatches 185; Indels 132; Gaps 17; 

Qy 101 DPADEKLLEEEIQA PTSSKRSQQHAKWPWMRKTEYISTEFNR — YGISNEKPEV 153 

I :| I |: I I I : I : : | : ::: | | | : : 

Db 6 DVIEEVLAGTEVPAIVHGVPKSTKKKKN LWEMEAQFMKTVLGRGSYSFFDNRRNK 60 

Qy 154 KIGVSVKQQFTEEEIYKDRDSQITAIEKTFEDAQKS ISQHYSKPR 198 

I : ||::::: I I : I I : I : : I 

Db 61 KKSSQLE'NVFQEKPDFENCNGWSTv^NRKKLPALKGSQIGIYWNLTKGSMMGPHWN-PM 119 

Qy 199 VTPVEV MPVFPDFKMWINPCAQVIFDSDP- 227 

I : : : I I I : I I I : I : : : 
Db 120 ATEIGIAIQGEGMWWCSKSGTGCKNMRFKVEEGDVFWPRF DPMAQMAFNNNSF 175 

Qy 228 APKDTSG-AAALEMMSQAMIRG MMDEEGNQFVAYFLP 263 

I : : I I : I : : : : : : : : I : : I 
Db 176 VFVGFSTTTKKHHPQYLTGKASVLRTLDRQILEASFNVGNTTMHQILEAQGDSVT LE 232 



Qy 264 VEETLKKRKRDQEEEMDYAPDDVYDYKIAREYNWNVKNKASKGYEENYFFIFREGDGVYY 323

:: I I I I I I :: : I I : : I : : : : I :

Db 233 CTSCAEEEKRLMEEEMRKEEEEAKKKEEARKAEEERREKEAEEERKR QEEEARKR 287

Qy

Db

Qy 378 --EEEEEEMETEEKEAGGSDEEQEKGSSSEKEGSEDEHSGSESEREEGDRDEASDKSGSG 435

I I I I I : I : I : I : : I I : I : I I : I i I :

Db 335 RREEEEAEKRRQEEEESRREEKARRRQQEEARRREEEEAAKRQHEEEAER-EAEEARRIE 393

Qy 436 EDES SEDEARAARDKEEIFGSDADSEDDADSDDEDRGQAQGGSDNDSDS 484

1:1: 1:11 II :M :|:: : ::| I I : :: ::

Db 394 EEEAQREAEEARRIQQEEEAERARRREE EAETRRKEEEEEESRRQEEESRRSEEEA 449

Qy 485 GSNGGGQRSRSHSR 498

Db 450 AREAERERQEEAER 463



RESULT 13 
US-11-185-924-16 

; Sequence 16, Application US/11185924 

; Publication No. US20060078945A1 

; GENERAL INFORMATION: 

; APPLICANT: Fisher et al., Larry 

; TITLE OF INVENTION: Complex Formed by Small Integrin-Binding Ligand, 

; TITLE OF INVENTION: N-Linked Glycoproteins (SIBLINGS) and Factor H 

; FILE REFERENCE: 4239-61301-02 

; CURRENT APPLICATION NUMBER: US/11/185,924 

; CURRENT FILING DATE: 2005-07-19 

; PRIOR APPLICATION NUMBER: 09/958,617 

; PRIOR FILING DATE: 2002-01-18 

; PRIOR APPLICATION NUMBER: PCT/US00/09349

; PRIOR FILING DATE: 2000-04-09 

; PRIOR APPLICATION NUMBER: 60/128,468 

; PRIOR FILING DATE: 1999-04-09 

; NUMBER OF SEQ ID NOS: 18

; SOFTWARE: PatentIn Ver. 2.1

; SEQ ID NO 16

; LENGTH: 513

; TYPE: PRT
; ORGANISM: Homo sapiens 
US-11-185-924-16 

Query Match 6.2%; Score 171; DB 7; Length 513; 

Best Local Similarity 22.4%; Pred. No. 0.0019; 

Matches 66; Conservative 54; Mismatches 117; Indels 58; Gaps 11; 

Qy 231 DTSGAAALEMMSQAMIRGMMDEEGNQFVAYFLPVEETLKKRKRDQEEEMDYAPDDVYDYK 290 

: : : I : I : I I : I : : I : : I : : I : : I : 
Db 233 NSAGMKSKESGENSEQANTQDSGGSQLLEH — PSRKIFRKSRISEEDDRSELDDN 285 

Qy 291 IAREYNWNVKNKASKGYEENYFFIFREGDGVYYNE 350



:| :| 



Db 286 NTMEEVKSDSTEN SNSRDTGLSQPRRDSKGDSQEDSKENL- 325 

Qy 351 VKHRDMNEKELEAQEARKAQLENHEPEEEEEEEMETEEKEAGGSD EEQEKGS 402 

: I :: ::: | | : I I : I I : : I : I : I : I I 

Db 326 SQEESQNVDGPSSESSQEANLSSQENSSESQEEWSESR GDNPDPTTSYVEDQEDSD 382 

Qy 403 SSEKEGSED-EHSGSESEREEGDRDEASDKSGSGED-ESSEDEARAARDKEEIFGSDADS 460 

I I I : : I I I I I I hi:: : | | II IN : : : : : I hi 
Db 383 SSEEDSSHTLSHSKSESREEQADSESSESLNFSEESPESPEDENSSSQEGLQSHSSSAES 442 

Qy 461 EDDADSDDEDRGQAQGGSDNDSDSGSNGGGQRSRSHSRSASPFPSGSEHSAQEDG 515 

: : : I I I : I I I I t : I I I : I :: I I I 

Db 443 QSEESHSEED DSDSQDSS RSKEDSNSTE SKSSSEEDG 479 



RESULT 14 

US-11-096-568A-28367 

; Sequence 28367, Application US/11096568A 
; Publication No. US20060048240A1 
; GENERAL INFORMATION: 

; APPLICANT: Alexandrov, Nickolai et al.

; TITLE OF INVENTION: Sequence-Determined DNA Fragments and Corresponding Polypeptides Encoded

; TITLE OF INVENTION: Thereby

; FILE REFERENCE: 2750-1592PUS2

; CURRENT APPLICATION NUMBER: US/11/096,568A

; CURRENT FILING DATE: 2005-04-01 

; NUMBER OF SEQ ID NOS: 34471

; SEQ ID NO 28367 

; LENGTH: 447 

; TYPE: PRT 

; ORGANISM: Arabidopsis thaliana 
; FEATURE: 

; NAME/KEY: misc_feature

; LOCATION: (1)..(447)
; OTHER INFORMATION: Ceres Seq. ID no. 2715782 
US-11-096-568A-28367 

Query Match 6.1%; Score 169.5; DB 7; Length 447; 

Best Local Similarity 26.2%; Pred. No. 0.002; 

Matches 59; Conservative 28; Mismatches 79; Indels 59; Gaps 10; 

Qy 356 MNEKELEAQEARKAQLENHEPEEEEEEEMETEEKEAGGS DEEQE KGSSSE 405 

: : : : I I : : I : I : I I : I I I I : : I II II I II 

Db 151 LDKTDAEGNERPESDDEDDEEDEEDEEEEEEGDEEDPGSGEIDGDERAEAPRMSNGHSER 210 



Qy 406 KEG SEDEHSGSESEREE GD — RDEASDKSGSGEDE 438 

: I I I I I : I : I I : I :: I I I I I : I 

Db 211 VDGVVT)V1)EDEESDAEDDESEQATGWGTSYRANGFRLEAVNGEEVREDDGDDSESGEEE 270 

Qy 439 SSED EARAARDKEEIFGSDADSEDDADSDDEDRGQAQGGSDNDSDSGSNG-GGQRS 493 

II I II :: | | | : | : | | : : | : I I I I 

Db 271 VGEDNDWEVHEIEDSE NEEDGVDDEEDDEEDE EEEEVDNADRGLGGSGS 320 



Qy 494 RSHSRSASPFPSGSEHSAQEDG SEAAASDSSEADSDSD 531

I : I : I I I : I I I I I 

Db 321 TSRLMNAGEIDGHEQGDDDEDGDGETGEDDQGVEDDGEFADEDDD 365 



RESULT 15 

US-11-087-099-9570 

; Sequence 9570, Application US/11087099
; Publication No. US20060041961A1
; GENERAL INFORMATION:
; APPLICANT: Abad, Mark S. et al.

; TITLE OF INVENTION: Genes and Uses for Plant Improvement
; FILE REFERENCE: 38-21 (53450) B EP
; CURRENT APPLICATION NUMBER: US/11/087,099
; CURRENT FILING DATE: 2005-03-22
; NUMBER OF SEQ ID NOS: 12464
; SEQ ID NO 9570
; LENGTH: 1758
; TYPE: PRT

; ORGANISM: Neurospora crassa
US-11-087-099-9570 



Query Match 6.1%; Score 169; DB 7; Length 1758; 

Best Local Similarity 23.2%; Pred. No. 0.011; 

Matches 116; Conservative 62; Mismatches 178; Indels 144; Gaps 27; 



Qy 94 I DPNVLLDPADEKLLEEEIQAPTS SKRSQQHAKWPWMRKTEY-I STEFNRYGI SNEKPE 152

I : I I : I I I : : I I : t I I I I : I : I

Db 878 I EGNVHVI EVDPKKAEDSLQGQVDMK NYIMW KKEYEAATAYLR 921

Qy 153 VKIGVSVKQQFTEEEIYKDRDSQITAIEKTFEDAQKSISQHYS--KPRVTPVEVMPVFPD 210

: I I I | : | : | : | : I I : I I : I I I : : I

Db 922 -MMGVDV TAQEV-KNEDEE EDGEKSKKKRRSKRKPKKKKKKKTTTGKD 967

Qy 211 FKMWINPCAQVIFD SDPAPKDTSGAAALEMMSQ 243

III | | | | | ::

Db 968 AQEYLDDEQTRNLEDGQDDKKDEQDDDNEKDERDDADEKDEDGQANVDEKD- 1018

Qy 244 AMI RGMMDEEGNQFVAYFLPVEE TLKKRKRDQEEEMDYAPDDVYDYKI 291

: I I I I I II : | | ::::::: | : |

Db 1019 KDEMDS EENS- DEYNTAAEEQS PQPKKS KINNAKRKRRKQRKQKAEQAAKE : KA 1070

Qy 292 AREYNWNVK NKAS KGYEENYFFI FREGDGVYYNELETRVRLS KRRAKAGVQSGT 345

I I I I I I I I I : : I I I I : : I II :

Db 1071 ERE-AWEKKKQEAKERKERKKREEEAQKAARKKEA REKETREK--EAREKAAREKEE 1124

Qy 346 NAL LVVKHRDMNEKELEAQEARKAQLENHEPEEEEEEEMETEEKEAGGSDEEQ 398

I: : : :: IN :|||: | |: I I III ::|

Db 1125 KAMREKAAREI AARKKEAREKETREKEAREKAAREKEAREKTAREKEAREKEV- - REKEA 1182

Qy 399 EKGSSSEKEGSE-DEHSGSESEREEG-DRDEASDKSGSGEDESSEDEARAARDKEE 452

I : I I : II : I I I I I : : : I : I : I I I : I I : : : I

Db 1183 RKKEAREKQVREKDAREKAAKEREEKVAREKEAQK--ARERQEQEKEAQKAREQQEQEAR 1240

Qy 453 IFGSDADSEDDADSDDEDRGQAQGGSDNDSDSG -SNGGGQRS 493

I : : I I I : I I : : I I I : I II

Db 1241 ERKKLDEVI WEEKVNEDDI KQEDEVKEEA DNNPMSSQALMPSQQQMPGPGPGPMM 1296

Qy 494 RSHSRSASPFPSGSEHSAQE 513

I I I I I I :

Db 1297 LGHQ IPPPPGLEPKPQQ 1313

Search completed: April 25, 2006, 09:26:39 
Job time : 28 secs
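
Note on the per-result statistics: the summary lines printed above each alignment can be cross-checked arithmetically. The sketch below is not taken from the search program's documentation. It assumes, from the numbers in this report alone, that Query Match is the hit score divided by the perfect score (2764 for this query) and that Best Local Similarity is Matches divided by the number of aligned columns (Matches + Conservative + Mismatches + Indels). The function names are hypothetical.

```python
# Cross-check of the "Query Match" and "Best Local Similarity" figures.
# Both formulas are inferred from this report's numbers (an assumption),
# not from the search program's documentation.

PERFECT_SCORE = 2764  # "Perfect score" stated in the report header


def query_match_pct(score, perfect_score=PERFECT_SCORE):
    """Hit score as a percentage of the query's self-alignment score."""
    return 100.0 * score / perfect_score


def best_local_similarity_pct(matches, conservative, mismatches, indels):
    """Identical positions as a percentage of all aligned columns."""
    columns = matches + conservative + mismatches + indels
    return 100.0 * matches / columns


# (score, matches, conservative, mismatches, indels) copied from the report
results = {
    4: (220.5, 93, 57, 160, 93),
    5: (212.5, 54, 38, 78, 15),
    13: (171, 66, 54, 117, 58),
}

for number, (score, m, c, x, i) in results.items():
    qm = query_match_pct(score)
    bls = best_local_similarity_pct(m, c, x, i)
    print(f"RESULT {number}: Query Match {qm:.1f}%; "
          f"Best Local Similarity {bls:.1f}%")
```

For the three results listed, both computed values agree with the printed percentages to one decimal place, which supports the assumed formulas without confirming them.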



